http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/dev/provider/index.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/dev/provider/index.asciidoc b/docs/src/dev/provider/index.asciidoc
index de8be1f..c64261f 100644
--- a/docs/src/dev/provider/index.asciidoc
+++ b/docs/src/dev/provider/index.asciidoc
@@ -20,8 +20,7 @@ image::apache-tinkerpop-logo.png[width=500,link="http://tinkerpop.apache.org"]
 :toc-position: left
-Provider Documentation
-======================
+= Provider Documentation
 TinkerPop exposes a set of interfaces, protocols, and tests that make it possible for third-parties to build libraries and systems that plug-in to the TinkerPop stack. TinkerPop refers to those third-parties as "providers" and this
@@ -38,8 +37,7 @@ This document attempts to address the needs of the different providers that have
 * Graph Plugin Provider
 [[graph-system-provider-requirements]]
-Graph System Provider Requirements
-----------------------------------
+== Graph System Provider Requirements
 image:tinkerpop-enabled.png[width=140,float=left] At the core of TinkerPop3 is a Java8 API. The implementation of this core API and its validation via the `gremlin-test` suite is all that is required of a graph system provider wishing to
@@ -48,8 +46,7 @@ provided by TinkerPop (e.g. Gremlin Console, Gremlin Server, etc.) and 3rd-party
 Gremlin-JS, etc.) will integrate properly. Finally, please feel free to use the logo on the left to promote your TinkerPop3 implementation.
-Implementing Gremlin-Core
-~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Implementing Gremlin-Core
 The classes that a graph system provider should focus on implementing are itemized below. It is a good idea to study the link:http://tinkerpop.apache.org/docs/x.y.z/reference/#tinkergraph-gremlin[TinkerGraph] (in-memory OLTP and OLAP
@@ -77,8 +74,7 @@ not efficient for the implementation, override them.
 * `ComputerGraph` is a `Wrapper` system that ensure proper semantics during a GraphComputer computation.
 [[oltp-implementations]]
-OLTP Implementations
-^^^^^^^^^^^^^^^^^^^^
+==== OLTP Implementations
 image:pipes-character-1.png[width=110,float=right] The most important interfaces to implement are in the `structure/` package. These include interfaces like Graph, Vertex, Edge, Property, Transaction, etc. The `StructureStandardSuite`
@@ -87,8 +83,7 @@ classes with static exceptions that should be thrown by the graph system so that
 messages are consistent amongst all TinkerPop3 implementations.
 [[olap-implementations]]
-OLAP Implementations
-^^^^^^^^^^^^^^^^^^^^
+==== OLAP Implementations
 image:furnace-character-1.png[width=110,float=right] Implementing the OLAP interfaces may be a bit more complicated. Note that before OLAP interfaces are implemented, it is necessary for the OLTP interfaces to be, at minimal,
@@ -111,8 +106,7 @@ link:http://tinkerpop.apache.org/docs/x.y.z/reference/#sparkgraphcomputer[SparkG
 Given the complexity of the OLAP system, it is good to study and copy many of the patterns used in these reference implementations.
-Implementing GraphComputer
-++++++++++++++++++++++++++
+===== Implementing GraphComputer
 image:furnace-character-3.png[width=150,float=right] The most complex method in GraphComputer is the `submit()`-method. The method must do the following:
@@ -129,8 +123,7 @@ image:furnace-character-3.png[width=150,float=right] The most complex method in
 . Update Memory with runtime information.
 . Construct a new `ComputerResult` containing the compute Graph and Memory.
-Implementing Memory
-+++++++++++++++++++
+===== Implementing Memory
 image:gremlin-brain.png[width=175,float=left] The Memory object is initially defined by `VertexProgram.setup()`. The memory data is available in the first round of the `VertexProgram.execute()` method. Each Vertex, when executing
@@ -138,15 +131,13 @@ the VertexProgram, can update the Memory in its round.
However, the update is no
 the next round. At the end of the first round, all the updates are aggregated and the new memory data is available on the second round. This process repeats until the VertexProgram terminates.
-Implementing Messenger
-++++++++++++++++++++++
+===== Implementing Messenger
 The Messenger object is similar to the Memory object in that a vertex can read and write to the Messenger. However, the data it reads are the messages sent to the vertex in the previous step and the data it writes are the messages that will be readable by the receiving vertices in the subsequent round.
-Implementing MapReduce Emitters
-+++++++++++++++++++++++++++++++
+===== Implementing MapReduce Emitters
 image:hadoop-logo-notext.png[width=150,float=left] The MapReduce framework in TinkerPop3 is similar to the model popularized by link:http://hadoop.apache.org[Hadoop]. The primary difference is that all Mappers process the vertices
@@ -290,8 +281,7 @@ for (final MapReduce mapReduce : mapReducers) {
 <2> If there is no reduce stage, the map-stage results are inserted into Memory as specified by the application developer's `MapReduce.addResultToMemory()` implementation.
-Hadoop-Gremlin Usage
-^^^^^^^^^^^^^^^^^^^^
+==== Hadoop-Gremlin Usage
 Hadoop-Gremlin is centered around `InputFormats` and `OutputFormats`. If a 3rd-party graph system provider wishes to leverage Hadoop-Gremlin (and its respective `GraphComputer` engines), then they need to provide, at minimum, a
@@ -325,8 +315,7 @@ case, then the `Configuration` provided to `HadoopGraph.open()` should be dynami
 determine how to read and write data to and from Hadoop. For instance, `gremlin.hadoop.graphReader` and `gremlin.hadoop.graphWriter`.
-GraphFilterAware Interface
-++++++++++++++++++++++++++
+===== GraphFilterAware Interface
 <<graph-filter,Graph filters>> by OLAP processors to only pull a subgraph of the full graph from the graph data source.
 For instance, the example below constructs a `GraphFilter` that will only pull the "knows"-graph amongst people into the `GraphComputer`
@@ -347,8 +336,7 @@ if (configuration.containsKey(Constants.GREMLIN_HADOOP_GRAPH_FILTER))
 this.graphFilter = VertexProgramHelper.deserialize(configuration, Constants.GREMLIN_HADOOP_GRAPH_FILTER);
 ----
-PersistResultGraphAware Interface
-+++++++++++++++++++++++++++++++++
+===== PersistResultGraphAware Interface
 A graph system provider's `OutputFormat` should implement the `PersistResultGraphAware` interface which determines which persistence options are available to the user. For the standard file-based `OutputFormats` provided
@@ -358,8 +346,7 @@ data files are not random access and are, in essence, immutable. Thus, these fil
 `ResultGraph.NEW` which creates a copy of the data specified by the `Persist` enum.
 [[io-implementations]]
-IO Implementations
-^^^^^^^^^^^^^^^^^^
+==== IO Implementations
 If a `Graph` requires custom serializers for IO to work properly, implement the `Graph.io` method. A typical example of where a `Graph` would require such a custom serializers is if their identifier system uses non-primitive values,
@@ -415,8 +402,7 @@ implementation remotely don't need a full dependency on the entire `Graph` - jus
 classes being serialized.
 [[remoteconnection-implementations]]
-RemoteConnection Implementations
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+==== RemoteConnection Implementations
 A `RemoteConnection` is an interface that is important for usage on traversal sources configured using the link:http://tinkerpop.apache.org/docs/x.y.z/reference/#connecting-via-remotegraph[withRemote()] option. A `Traversal`
@@ -452,8 +438,7 @@ similar to Gremlin Server that can accept a serialized `Traversal` instance, the
 reason to implement this interface.
 [[validating-with-gremlin-test]]
-Validating with Gremlin-Test
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Validating with Gremlin-Test
 image:gremlin-edumacated.png[width=225]
@@ -597,8 +582,7 @@ environment variable, depending on your project layout. Some tests require this
 creating temporary files. The value is typically set to the project build directory. For example using the Maven SureFire Plugin, this is done via the configuration argLine with `-Dbuild.dir=${project.build.directory}`.
-Accessibility via GremlinPlugin
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Accessibility via GremlinPlugin
 image:gremlin-plugin.png[width=100,float=left] The applications distributed with TinkerPop3 do not distribute with any graph system implementations besides TinkerGraph. If your implementation is stored in a Maven repository (e.g.
@@ -641,8 +625,7 @@ gremlin> :plugin use tinkerpop.neo4j
 gremlin> g = Neo4jGraph.open('/tmp/neo4j')
 ==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]]
-In-Depth Implementations
-~~~~~~~~~~~~~~~~~~~~~~~~
+=== In-Depth Implementations
 image:gremlin-painting.png[width=200,float=right] The graph system implementation details presented thus far are minimum requirements necessary to yield a valid TinkerPop3 implementation. However, there are other areas that a
@@ -657,8 +640,7 @@ lookup becomes an `O(log(|V|))`. Please review `TinkerGraphStepStrategy` for ide
 ultimately referenced by the `GraphTraversal` interface. It is possible to extend `GraphTraversal` to use a graph system specific step implementation.
-Graph Driver Provider Requirements
-----------------------------------
+== Graph Driver Provider Requirements
 image::gremlin-server-protocol.png[width=325]
@@ -786,14 +768,12 @@ Gremlin Server will send:
 NOTE: Please refer to the link:http://tinkerpop.apache.org/docs/current/dev/io[IO Reference Documentation] for more examples of `RequestMessage` and `ResponseMessage` instances.
-OpProcessors Arguments
-~~~~~~~~~~~~~~~~~~~~~~
+=== OpProcessors Arguments
 The following sections define a non-exhaustive list of available operations and arguments for embedded `OpProcessors` (i.e. ones packaged with Gremlin Server).
-Common
-^^^^^^
+==== Common
 All `OpProcessor` instances support these arguments.
@@ -803,8 +783,7 @@ All `OpProcessor` instances support these arguments.
 |batchSize |Int |When the result is an iterator this value defines the number of iterations each `ResponseMessage` should contain - overrides the `resultIterationBatchSize` server setting.
 |=========================================================
-Standard OpProcessor
-^^^^^^^^^^^^^^^^^^^^
+==== Standard OpProcessor
 The "standard" `OpProcessor` handles requests for the primary function of Gremlin Server - executing Gremlin. Requests made to this `OpProcessor` are "sessionless" in the sense that a request must encapsulate the entirety
@@ -846,8 +825,7 @@ to send an alias pair with key of "g" and value of "g2" and thus allow the scrip
 |scriptEvaluationTimeout |Long |An override for the server setting that determines the maximum time to wait for a script to execute on the server.
 |=========================================================
-Session OpProcessor
-^^^^^^^^^^^^^^^^^^^
+==== Session OpProcessor
 The "session" `OpProcessor` handles requests for the primary function of Gremlin Server - executing Gremlin. It is like the "standard" `OpProcessor`, but instead maintains state between sessions and allows the option to leave all
@@ -906,8 +884,7 @@ of resources which can be desirable if Gremlin Server has a long session timeout
 as attempts to close long run jobs can occur more rapidly. If not provided, this value is `false`.
 |=========================================================
-Traversal OpProcessor
-^^^^^^^^^^^^^^^^^^^^^
+==== Traversal OpProcessor
 Both the Standard and Session OpProcessors allow for Gremlin scripts to be submitted to the server.
 The `TraversalOpProcessor` however allows Gremlin `Bytecode` to be submitted to the server. Supporting this `OpProcessor`
@@ -1029,8 +1006,7 @@ rolled back up into a single object or simply left as-is. There are four values
 |sideEffect |UUID | *Required* The unique identifier for the request that original submitted the traversal (side-effects are keyed by that value)
 |=========================================================
-Authentication
-~~~~~~~~~~~~~~
+=== Authentication
 Gremlin Server supports link:https://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer[SASL-based] authentication. A SASL implementation provides a series of challenges and responses that a driver must comply with
@@ -1050,8 +1026,7 @@ authentication. If it cannot authenticate given the challenge response from the
 NOTE: Gremlin Server does not support the "authorization identity" as described in link:https://tools.ietf.org/html/rfc4616[RFC4616].
 [[gremlin-plugins]]
-Gremlin Plugins
----------------
+== Gremlin Plugins
 image:gremlin-plugin.png[width=125]
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/index.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/index.asciidoc b/docs/src/index.asciidoc
index 4ff4ebb..eb76a7b 100644
--- a/docs/src/index.asciidoc
+++ b/docs/src/index.asciidoc
@@ -19,8 +19,7 @@ image::apache-tinkerpop-logo.png[width=500]
 *x.y.z*
-TinkerPop Compendium
---------------------
+== TinkerPop Compendium
 image::tinkerpop-reading.png[width=800,align="center"]
@@ -50,8 +49,7 @@ Note the "+" following the link in each table entry - it forces an asciidoc line
 ////
 [[tutorials]]
-Tutorials
-~~~~~~~~~
+=== Tutorials
 [width="100%",cols="<.<3,<.^10",grid="none"]
 |=========================================================
@@ -71,8 +69,7 @@ and an overview of Gremlin. (*external*)
 |=========================================================
 [[publications]]
-Publications
-~~~~~~~~~~~~
+=== Publications
 Unless otherwise noted, all "publications" are externally managed:
@@ -91,8 +88,7 @@ Unless otherwise noted, all "publications" are externally managed:
 * Rodriguez, M.A., Kuppitz, D., Yim, K., link:http://www.datastax.com/dev/blog/tales-from-the-tinkerpop["Tales from the TinkerPop,"] DataStax Engineering Blog, July 2015.
 [[developer]]
-Developer
-~~~~~~~~~
+=== Developer
 [width="100%",cols="<.<3,<.^10",grid="none"]
 |=========================================================
@@ -102,4 +98,4 @@ Provides information on ways to contribute to TinkerPop as well as details on bu
 Documentation for providers who implement the TinkerPop interfaces, develop plugins or drivers, or provide other third-party libraries for TinkerPop.
 |image:gremlin-io2.png[width=200] |link:http://tinkerpop.apache.org/docs/x.y.z/dev/io/[IO Reference] +
 Reference Documentation for providers and users of the various IO formats that TinkerPop has: GraphML, GraphSON and Gryo.
-|=========================================================
\ No newline at end of file
+|=========================================================
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/appendix.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/appendix.asciidoc b/docs/src/recipes/appendix.asciidoc
index a65b8f8..3cebe8a 100644
--- a/docs/src/recipes/appendix.asciidoc
+++ b/docs/src/recipes/appendix.asciidoc
@@ -14,8 +14,7 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 ////
-Appendix
-========
+= Appendix
 Many of the recipes are based on questions and answers provided on the link:https://groups.google.com/forum/#!forum/gremlin-users[gremlin-users mailing list] or on
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/between-vertices.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/between-vertices.asciidoc b/docs/src/recipes/between-vertices.asciidoc
index 5752f3e..c8fa890 100644
--- a/docs/src/recipes/between-vertices.asciidoc
+++ b/docs/src/recipes/between-vertices.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[between-vertices]]
-Between Vertices
-----------------
+== Between Vertices
 It is quite common to have a situation where there are two particular vertices of a graph and a need to execute some traversal on the paths found between them. Consider the following examples using the modern toy graph:
@@ -116,4 +115,4 @@ g.V(vRexsterJob1, vBlueprintsJob1).as('job').
 While the traversals above are more complex, the pattern for finding "things" between two vertices is largely the same.
 Note the use of the `where()` step to terminate the traversers for a specific user. It is embedded in a `coalesce()` step to handle situations where the specified user did not complete an application for the specified job and will
-return `false` in those cases.
\ No newline at end of file
+return `false` in those cases.
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/centrality.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/centrality.asciidoc b/docs/src/recipes/centrality.asciidoc
index 276bd62..30e751a 100644
--- a/docs/src/recipes/centrality.asciidoc
+++ b/docs/src/recipes/centrality.asciidoc
@@ -15,16 +15,14 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[centrality]]
-Centrality
-----------
+== Centrality
 There are many measures of link:https://en.wikipedia.org/wiki/Centrality[centrality] which are meant to help identify the most important vertices in a graph. As these measures are common in graph theory, this section attempts to demonstrate how some of these different indicators can be calculated using Gremlin.
 [[degree-centrality]]
-Degree Centrality
-~~~~~~~~~~~~~~~~~
+=== Degree Centrality
 link:https://en.wikipedia.org/wiki/Centrality#Degree_centrality[Degree centrality] is a measure of the number of edges associated to each vertex. The following examples use the modern toy graph:
@@ -56,8 +54,7 @@ and as a result, the grouping will be on the incoming `Vertex` object itself. Th
 stored in the `Map` for each key.
 [[betweeness-centrality]]
-Betweeness Centrality
-~~~~~~~~~~~~~~~~~~~~~
+=== Betweeness Centrality
 link:https://en.wikipedia.org/wiki/Betweenness_centrality[Betweeness centrality] is a measure of the number of times a vertex is found between the <<shortest-path,shortest path>> of each vertex pair in a graph.
 Consider the following
@@ -111,8 +108,7 @@ and 8049 edges already require a massive amount of compute resources to determin
 pairs).
 [[closeness-centrality]]
-Closeness Centrality
-~~~~~~~~~~~~~~~~~~~~
+=== Closeness Centrality
 link:https://en.wikipedia.org/wiki/Centrality[Closeness centrality] is a measure of the distance of one vertex to all other reachable vertices in the graph. The following examples use the modern toy graph:
@@ -151,8 +147,7 @@ and 8049 edges already require a massive amount of compute resources to determin
 pairs).
 [[eigenvector-centrality]]
-Eigenvector Centrality
-~~~~~~~~~~~~~~~~~~~~~~
+=== Eigenvector Centrality
 A calculation of link:https://en.wikipedia.org/wiki/Centrality#Eigenvector_centrality[eigenvector centrality] uses the relative importance of adjacent vertices to help determine their centrality. In other words, unlike
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/connected-components.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/connected-components.asciidoc b/docs/src/recipes/connected-components.asciidoc
index 250258d..bd05cf7 100644
--- a/docs/src/recipes/connected-components.asciidoc
+++ b/docs/src/recipes/connected-components.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[connected-components]]
-Connected Components
---------------------
+== Connected Components
 Gremlin can be used to find link:https://en.wikipedia.org/wiki/Connected_component_(graph_theory)[connected components] in a graph. Consider the following graph which has three connected components:
@@ -79,4 +78,4 @@ g.withComputer().V().emit(cyclicPath().or().not(both())).repeat(both()).until(cy
 filter(unfold().where(eq("v"))).
 unfold().dedup().order().by(id).fold()
 ).toSet()
-----
\ No newline at end of file
+----
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/cycle-detection.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/cycle-detection.asciidoc b/docs/src/recipes/cycle-detection.asciidoc
index f18b358..741e1ec 100644
--- a/docs/src/recipes/cycle-detection.asciidoc
+++ b/docs/src/recipes/cycle-detection.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[cycle-detection]]
-Cycle Detection
----------------
+== Cycle Detection
 A cycle occurs in a graph where a path loops back on itself to the originating vertex. For example, in the graph depticted below Gremlin could be use to detect the cycle among vertices `A-B-C`.
@@ -113,4 +112,4 @@ g.V().sideEffect(outE("bridge").aggregate("bridges")).barrier().
 select("bridges").count(local).where(eq("c"))).limit(1).
 path().by(id).by(constant(" -> ")).
 map {String.join("", it.get().objects())}
-----
\ No newline at end of file
+----
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/duplicate-edge.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/duplicate-edge.asciidoc b/docs/src/recipes/duplicate-edge.asciidoc
index a62716e..ff31fc5 100644
--- a/docs/src/recipes/duplicate-edge.asciidoc
+++ b/docs/src/recipes/duplicate-edge.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[duplicate-edge]]
-Duplicate Edge Detection
-------------------------
+== Duplicate Edge Detection
 Whether part of a graph maintenance process or for some other analysis need, it is sometimes necessary to detect if there is more than one edge between two vertices.
 The following examples will assume that an edge with the same
@@ -157,4 +156,4 @@ g.withoutStrategies(LazyBarrierStrategy, PathRetractionStrategy).V().as("ov").
 where(outV().as("ov")).as("e2").
 filter(select("e1","e2").by(label).where("e1", eq("e2"))).
 filter(select("e1","e2").by("weight").where("e1", eq("e2"))).valueMap(true)
-----
\ No newline at end of file
+----
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/duplicate-vertex.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/duplicate-vertex.asciidoc b/docs/src/recipes/duplicate-vertex.asciidoc
index 5730eee..0ac2e9a 100644
--- a/docs/src/recipes/duplicate-vertex.asciidoc
+++ b/docs/src/recipes/duplicate-vertex.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[duplicate-vertex]]
-Duplicate Vertex Detection
---------------------------
+== Duplicate Vertex Detection
 The pattern for finding duplicate vertices is quite similar to the pattern defined in the <<duplicate-edge,Duplicate Edge>> section. The idea is to extract the relevant features of the vertex into a comparable list that can then be used to
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/if-then-based-grouping.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/if-then-based-grouping.asciidoc b/docs/src/recipes/if-then-based-grouping.asciidoc
index dc0025f..0336b13 100644
--- a/docs/src/recipes/if-then-based-grouping.asciidoc
+++ b/docs/src/recipes/if-then-based-grouping.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[if-then-based-grouping]]
-If-Then Based Grouping
-----------------------
+== If-Then Based Grouping
 Consider the following traversal over the "modern" toy graph:
@@ -68,4 +67,4 @@ g.V().hasLabel("person").
 constant("very old")))
 ----
-The answer is the same, but this traversal removes the nested `choose`, which makes it easier to read.
\ No newline at end of file
+The answer is the same, but this traversal removes the nested `choose`, which makes it easier to read.
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/index.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/index.asciidoc b/docs/src/recipes/index.asciidoc
index 31095c0..b98779f 100644
--- a/docs/src/recipes/index.asciidoc
+++ b/docs/src/recipes/index.asciidoc
@@ -20,8 +20,7 @@ image::apache-tinkerpop-logo.png[width=500,link="http://tinkerpop.apache.org"]
 :toc-position: left
-Recipes
-=======
+= Recipes
 image:gremlin-chef.png[width=120,float=left] All programming languages tend to have link:https://en.wikipedia.org/wiki/Software_design_pattern[patterns of usage] for commonly occurring problems. Gremlin
@@ -33,8 +32,7 @@ Recipes assume general familiarity with Gremlin and the TinkerPop stack. Be sure
 link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/getting-started[Getting Started] tutorial and the link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/[The Gremlin Console] tutorial.
-Traversal Recipes
-=================
+= Traversal Recipes
 include::between-vertices.asciidoc[]
@@ -60,16 +58,14 @@ include::traversal-induced-values.asciidoc[]
 include::tree.asciidoc[]
-Implementation Recipes
-======================
+= Implementation Recipes
 include::style-guide.asciidoc[]
 include::traversal-component-reuse.asciidoc[]
 [[contributing]]
-How to Contribute a Recipe
-==========================
+= How to Contribute a Recipe
 Recipes are generated under the same system as all TinkerPop documentation and is stored directly in the source code repository. TinkerPop documentation is all link:http://asciidoc.org/[asciidoc] based and can be generated locally with
@@ -137,4 +133,4 @@ GitHub and JIRA are linked. As mentioned earlier in this section, the recipe wi
 committers prior to merging. This process may take several days to complete. We look forward to receiving your submissions!
-include::appendix.asciidoc[]
\ No newline at end of file
+include::appendix.asciidoc[]
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/pagination.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/pagination.asciidoc b/docs/src/recipes/pagination.asciidoc
index 510a586..13262e4 100644
--- a/docs/src/recipes/pagination.asciidoc
+++ b/docs/src/recipes/pagination.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[pagination]]
-Pagination
-----------
+== Pagination
 image:gremlin-paging.png[float=left,width=330]In most database applications, it is oftentimes desireable to return discrete blocks of data for a query rather than all of the data that the total results would contain.
 This approach to
@@ -76,4 +75,4 @@ The only way to completely avoid that problem is to re-use the same traversal in
 t = g.V().hasLabel('person');[]
 t.next(2)
 t.next(2)
-----
\ No newline at end of file
+----
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/recommendation.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/recommendation.asciidoc b/docs/src/recipes/recommendation.asciidoc
index 0aaa7e4..7d93eb9 100644
--- a/docs/src/recipes/recommendation.asciidoc
+++ b/docs/src/recipes/recommendation.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[recommendation]]
-Recommendation
---------------
+== Recommendation
 image:gremlin-recommendation.png[float=left,width=180]One of the more common use cases for a graph database is the development of link:https://en.wikipedia.org/wiki/Recommender_system[recommendation systems] and a simple approach to
@@ -288,4 +287,4 @@ In using sampling methods, it is important to consider that the natural ordering
 an ideal sample for the recommendation. For example, if the edges end up being returned oldest first, then the recommendation will be based on the oldest data, which would not be ideal. As with any traversal, it is important to understand the nature of the graph being traversed and the behavior of the underlying graph database to properly
-achieve the desired outcome.
\ No newline at end of file
+achieve the desired outcome.
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/shortest-path.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/shortest-path.asciidoc b/docs/src/recipes/shortest-path.asciidoc
index b9ecec8..04301f1 100644
--- a/docs/src/recipes/shortest-path.asciidoc
+++ b/docs/src/recipes/shortest-path.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[shortest-path]]
-Shortest Path
--------------
+== Shortest Path
 image:shortest-path.png[width=300]
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/style-guide.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/style-guide.asciidoc b/docs/src/recipes/style-guide.asciidoc
index 14cbad9..6da682d 100644
--- a/docs/src/recipes/style-guide.asciidoc
+++ b/docs/src/recipes/style-guide.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[style-guide]]
-Style Guide
------------
+== Style Guide
 Gremlin is a data flow language where each new step concatenation alters the stream accordingly. This aspect of the language allows users to easily "build-up" a traversal (literally) step-by-step until the expected results are
@@ -66,8 +65,7 @@ The `unfold()`-step is a data formatting necessity that should not be made too p
 <6> If there is only one `by()`-modulator (or a series of short ones), keep it on one line, else each `by()` is a new line.
 <7> Back to a series `ins().outs().filters().etc()`.
-Style Guide Rules
-~~~~~~~~~~~~~~~~~
+=== Style Guide Rules
 A generalization of the specifics above are presented below.
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/traversal-component-reuse.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/traversal-component-reuse.asciidoc b/docs/src/recipes/traversal-component-reuse.asciidoc
index 2b65644..3b9b408 100644
--- a/docs/src/recipes/traversal-component-reuse.asciidoc
+++ b/docs/src/recipes/traversal-component-reuse.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[traversal-component-reuse]]
-Traversal Component Reuse
--------------------------
+== Traversal Component Reuse
 Good software development practices require reuse to keep software maintainable. In Gremlin, there are often bits of traversal logic that could be represented as components that might be tested independently and utilized
@@ -68,4 +67,4 @@ weightFilter = { w -> outE("knows").has('weight', P.gt(w)).inV() }
 g.V(1).flatMap(weightFilter(0.5d)).both()
 g.V(1).flatMap(weightFilter(0.5d)).bothE().otherV()
 g.V(1).flatMap(weightFilter(0.5d)).groupCount()
-----
\ No newline at end of file
+----
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/traversal-induced-values.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/traversal-induced-values.asciidoc b/docs/src/recipes/traversal-induced-values.asciidoc
index d2376ef..8680600 100644
--- a/docs/src/recipes/traversal-induced-values.asciidoc
+++ b/docs/src/recipes/traversal-induced-values.asciidoc
@@ -15,8 +15,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[traversal-induced-values]]
-Traversal Induced Values
-------------------------
+== Traversal Induced Values
 The parameters of a `Traversal` can be known ahead of time as constants or might otherwise be passed in as dynamic arguments.
@@ -173,4 +172,4 @@ Using the same example, the "weight" property on the incident edges will be used g.withSack(0).V().has("age"). sack(assign).by("age").sack(sum).by(bothE().values("weight").sum()). property("age", sack()).valueMap() ----- \ No newline at end of file +---- http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/recipes/tree.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/recipes/tree.asciidoc b/docs/src/recipes/tree.asciidoc index 48dcf1b..da74a3d 100644 --- a/docs/src/recipes/tree.asciidoc +++ b/docs/src/recipes/tree.asciidoc @@ -15,13 +15,11 @@ See the License for the specific language governing permissions and limitations under the License. //// [[tree]] -Tree ----- +== Tree image:gremlin-tree.png[width=280] -Lowest Common Ancestor -~~~~~~~~~~~~~~~~~~~~~~ +=== Lowest Common Ancestor image:tree-lca.png[width=230,float=right] Given a tree, the link:https://en.wikipedia.org/wiki/Lowest_common_ancestor[lowest common ancestor] is the deepest vertex that is common to two or more other vertices. The diagram to the right depicts the common @@ -111,8 +109,7 @@ g.withComputer(). <1> The main difference for OLAP is the use of `aggregate()` over the mid-traversal`V()`. -Maximum Depth -~~~~~~~~~~~~~ +=== Maximum Depth Finding the maximum depth of a tree starting from a specified root vertex can be determined as follows: @@ -155,8 +152,7 @@ those without incoming edges). Second, all results save the last one can be igno the one at the deepest point in the tree). In this way, the path and path length only need to be calculated for a single result. -Time-based Indexing -~~~~~~~~~~~~~~~~~~~ +=== Time-based Indexing Trees can be used for modelling time-oriented data in a graph. Modeling time where there are "year", "month" and "day" vertices (or lower granularity as needed) allows the structure of the graph to inherently index data tied to them. 
@@ -215,4 +211,4 @@ g.V().has('name','2016').out('may').out('day31').as('start'). <1> Find all the events in 2016. <2> Find all the events in May of 2016. <3> Find all the events on May 31, 2016. -<4> Find all the events between May 31, 2016 and June 1, 2016. \ No newline at end of file +<4> Find all the events between May 31, 2016 and June 1, 2016. http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/acknowledgements.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/acknowledgements.asciidoc b/docs/src/reference/acknowledgements.asciidoc index 0ce909d..4376445 100644 --- a/docs/src/reference/acknowledgements.asciidoc +++ b/docs/src/reference/acknowledgements.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[acknowledgements]] -Acknowledgements -================ += Acknowledgements image:yourkit-logo.png[width=200,float=left] YourKit supports the TinkerPop open source project with its full-featured Java Profiler. YourKit, LLC is the creator of innovative and intelligent tools for profiling Java and .NET http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/conclusion.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/conclusion.asciidoc b/docs/src/reference/conclusion.asciidoc index c5eecf3..ac512b8 100644 --- a/docs/src/reference/conclusion.asciidoc +++ b/docs/src/reference/conclusion.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[conclusion]] -Conclusion -========== += Conclusion image:tinkerpop-character.png[width=100,float=left] The world that we know, you and me, is but a subset of the world that Gremlin has weaved within The TinkerPop. 
Gremlin has constructed a fully connected graph and only the subset that http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/gremlin-applications.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/gremlin-applications.asciidoc b/docs/src/reference/gremlin-applications.asciidoc index dac3676..94fcd1c 100644 --- a/docs/src/reference/gremlin-applications.asciidoc +++ b/docs/src/reference/gremlin-applications.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[gremlin-applications]] -Gremlin Applications -==================== += Gremlin Applications Gremlin applications represent tools that are built on top of the core APIs to help expose common functionality to users when working with graphs. There are two key applications: @@ -70,8 +69,7 @@ WARNING: If building TinkerPop from source, be sure to clear TinkerPop-related j directory as they can become stale on some systems and not re-import properly from the local `.m2` after fresh rebuilds. [[gremlin-console]] -Gremlin Console ---------------- +== Gremlin Console image:gremlin-console.png[width=325,float=right] The Gremlin Console is an interactive terminal or link:http://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop[REPL] that can be used to traverse graphs @@ -127,8 +125,7 @@ g.V().has('name','marko').out('knows').values('name') TIP: When using Gremlin-Groovy in a Groovy class file, add `static { GremlinLoader.load() }` to the head of the file. -Console Commands -~~~~~~~~~~~~~~~~ +=== Console Commands In addition to the standard commands of the link:http://www.groovy-lang.org/Groovy+Shell[Groovy Shell], Gremlin adds some other useful operations. The following table outlines the most commonly used commands: @@ -149,8 +146,7 @@ some other useful operations. 
The following table outlines the most commonly us |========================================================= [[console-preferences]] -Console Preferences -~~~~~~~~~~~~~~~~~~~ +=== Console Preferences Preferences are set with `:set name value`. Values can contain spaces when quoted. All preferences are reset by `:purge preferences` @@ -197,8 +193,7 @@ Example: :set gremlin.color bg_black,green,bold ---- -Dependencies and Plugin Usage -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +=== Dependencies and Plugin Usage The Gremlin Console can dynamically load external code libraries and make them available to the user. Furthermore, those dependencies may contain Gremlin plugins which can expand the language, provide useful functions, etc. These @@ -254,8 +249,7 @@ contains the class names of the "active" plugins. It is also possible to clear deleting them from the `ext` directory. [[execution-mode]] -Execution Mode -~~~~~~~~~~~~~~ +=== Execution Mode For automated tasks and batch executions of Gremlin, it can be useful to execute Gremlin scripts in "execution" mode from the command line. Consider the following file named `gremlin.groovy`: @@ -278,8 +272,7 @@ v[2] v[3] v[4] v[5] -v[6] ----- +== v[6] It is also possible to pass arguments to scripts. Any parameters following the file name specification are treated as arguments to the script. They are collected into a list and passed in as a variable called "args". The following @@ -300,8 +293,7 @@ When executed from the command line a parameter can be supplied: $ bin/gremlin.sh -e gremlin.groovy marko v[1] $ bin/gremlin.sh -e gremlin.groovy vadas -v[2] ----- +== v[2] It is also possible to pass multiple scripts by specifying multiple `-e` options. The scripts will execute in the order that they are specified. Note that only the arguments from the last script executed will be preserved in the console. 
@@ -314,8 +306,7 @@ $ bin/gremlin.sh -e "gremlin.groovy -e -i --color" ---- [[interactive-mode]] -Interactive Mode -~~~~~~~~~~~~~~~~ +=== Interactive Mode The Gremlin Console can be started in an "interactive" mode. Interactive mode is like <<execution-mode, execution mode>> but the console will not exit at the completion of the script, even if the script completes unsuccessfully. In such a @@ -361,8 +352,7 @@ Like, execution mode, it is also possible to pass multiple scripts by specifying <<execution-mode, Execution Mode Section>> for more information on the specfics of that capability. [[gremlin-server]] -Gremlin Server --------------- +== Gremlin Server image:gremlin-server.png[width=400,float=right] Gremlin Server provides a way to remotely execute Gremlin scripts against one or more `Graph` instances hosted within it. The benefits of using Gremlin Server include: @@ -389,8 +379,7 @@ to the server). Developers should consider the security implications involved in appropriate precautions. Please review the <<script-execution,Script Execution Section>> for more information. [[starting-gremlin-server]] -Starting Gremlin Server -~~~~~~~~~~~~~~~~~~~~~~~ +=== Starting Gremlin Server Gremlin Server comes packaged with a script called `bin/gremlin-server.sh` to get it started (use `gremlin-server.bat` on Windows): @@ -461,8 +450,7 @@ WARNING: Transactions on graphs in initialization scripts are not closed automat executing. It is up to the script to properly commit or rollback transactions in the script itself. [[connecting-via-console]] -Connecting via Console -~~~~~~~~~~~~~~~~~~~~~~ +=== Connecting via Console With Gremlin Server running it is now possible to issue some scripts to it for processing. Start Gremlin Console as follows: @@ -569,8 +557,7 @@ have no timeout. By default, this setting uses "none". 
|========================================================= [[console-aliases]] -Aliases -^^^^^^^ +==== Aliases The `alias` configuration command for the Gremlin Server `:remote` can be useful in situations where there are multiple `Graph` or `TraversalSource` instances on the server, as it becomes possible to rename them from the client @@ -584,8 +571,7 @@ for purposes of execution within the context of a script. Therefore, it becomes ---- [[console-sessions]] -Sessions -^^^^^^^^ +==== Sessions A `:remote` created in the following fashion will be "sessionless", meaning each script issued to the server with `:>` will be encased in a transaction and no state will be maintained from one request to the next. @@ -616,8 +602,7 @@ request will occur within the bounds of a transaction). In this way, the state o maintained, but the need to manually managed the transactional scope of the graph is no longer required. [[console-remote-console]] -Remote Console -^^^^^^^^^^^^^^ +==== Remote Console Previous examples have shown usage of the `:>` command to send scripts to Gremlin Server. The Gremlin Console also supports an additional method for doing this which can be more convenient when the intention is to exclusively @@ -647,8 +632,7 @@ NOTE: Console commands, those that begin with a colon (e.g. `:x`, `:remote`) do They are all still evaluated locally. [[connecting-via-java]] -Connecting via Java -~~~~~~~~~~~~~~~~~~~ +=== Connecting via Java [source,xml] ---- @@ -700,8 +684,7 @@ In this case, they are streamed from the server as they arrive. <5> Parameterized request are considered the most efficient way to send Gremlin to the server as they can be cached, which will boost performance and reduce resources required on the server. 
-Configuration -^^^^^^^^^^^^^ +==== Configuration The following table describes the various configuration options for the Gremlin Driver: @@ -741,8 +724,7 @@ The following table describes the various configuration options for the Gremlin Please see the link:http://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/driver/Cluster.Builder.html[Cluster.Builder javadoc] to get more information on these settings. -Aliases -^^^^^^^ +==== Aliases Scripts submitted to Gremlin Server automatically have the globally configured `Graph` and `TraversalSource` instances made available to them. Therefore, if Gremlin Server configures two `TraversalSource` instances called "g1" and "g2" @@ -766,8 +748,7 @@ g2Client.submit("g.V()") The above code demonstrates how the `alias` method can be used such that the script need only contain a reference to "g" and "g1" and "g2" are automatically rebound into "g" on the server-side. -Serialization -^^^^^^^^^^^^^ +==== Serialization When using Gryo serialization (the default serializer for the driver), it is important that the client and server have the same serializers configured or else one or the other will experience serialization exceptions and fail to @@ -788,8 +769,7 @@ what classes (from Titan in this case) to auto-register during serialization. G approach when it configures it's serializers, so using this same model will ensure compatibility when making requests. [[connecting-via-python]] -Connecting via Python -~~~~~~~~~~~~~~~~~~~~~ +=== Connecting via Python [source,python] ---- @@ -843,8 +823,7 @@ returns a `concurrent.futures.Future` that resolves to a list when it is complet <9> Verify that the all results have been read and stream is closed. <10> Close client and underlying pool connections. -Configuration -^^^^^^^^^^^^^ +==== Configuration The following table describes the various configuration options for the Gremlin-Python Driver. 
They can be passed to the `Client` instance as keyword arguments: @@ -861,8 +840,7 @@ can be passed to the `Client` instance as keyword arguments: |username |The username to submit on requests that require authentication. |"" |========================================================= -Connecting via REST -~~~~~~~~~~~~~~~~~~~ +=== Connecting via REST image:gremlin-rexster.png[width=225,float=left] While the default behavior for Gremlin Server is to provide a WebSocket-based connection, it can also be configured to support link:http://en.wikipedia.org/wiki/Representational_state_transfer[REST]. @@ -947,8 +925,7 @@ quite possible that such a script will generate `OutOfMemoryError` exceptions on WebSocket configuration, which supports streaming, if that type of use case is required. [[connecting-via-remotegraph]] -Connecting via withRemote -~~~~~~~~~~~~~~~~~~~~~~~~~ +=== Connecting via withRemote [source,xml] ---- @@ -1056,8 +1033,7 @@ cluster.close() Both traversals are abstractly defined as `g.V(id).out('created').values('name')` and thus, the first submission can be cached for faster evaluation on the next submission. -Configuring -~~~~~~~~~~~ +=== Configuring As mentioned earlier, Gremlin Server is configured though a YAML file. By default, Gremlin Server will look for a file called `conf/gremlin-server.yaml` to configure itself on startup. To override this default, supply the file @@ -1154,8 +1130,7 @@ link:http://repo1.maven.org/maven2/org/acplt/oncrpc/1.0.7/[here] and copy it to before starting the server. [[opprocessor-configurations]] -OpProcessor Configurations -^^^^^^^^^^^^^^^^^^^^^^^^^^ +==== OpProcessor Configurations An `OpProcessor` provides a way to plug-in handlers to Gremlin Server's processing flow. Gremlin Server uses this plug-in system itself to expose the packaged functionality that it exposes. 
Configurations can be supplied to an @@ -1171,8 +1146,7 @@ processors: The following sub-sections describe those configurations for each `OpProcessor` implementations supplied with Gremlin Server. -SessionOpProcessor -++++++++++++++++++ +===== SessionOpProcessor The `SessionOpProcessor` provides a way to interact with Gremlin Server over a <<sessions,session>>. @@ -1197,8 +1171,7 @@ method for processing script evaluation requests. |========================================================= [[traversalopprocessor]] -TraversalOpProcessor -++++++++++++++++++++ +===== TraversalOpProcessor The `TraversalOpProcessor` provides a way to use <<connecting-via-remotegraph,RemoteGraph>>. @@ -1209,8 +1182,7 @@ The `TraversalOpProcessor` provides a way to use <<connecting-via-remotegraph,Re |cacheMaxSize |The maximum number of entries in the side-effect cache. |1000 |========================================================= -Security and Execution -^^^^^^^^^^^^^^^^^^^^^^ +==== Security and Execution image:gremlin-server-secure.png[width=175,float=right] Gremlin Server provides for several features that aid in the security of the graphs that it exposes. It has built in SSL support and a pluggable authentication framework using @@ -1230,8 +1202,7 @@ authentication: { config: { credentialsDb: conf/tinkergraph-credentials.properties}} -Quick Start -+++++++++++ +===== Quick Start A quick way to get started with the `SimpleAuthenticator` is to use TinkerGraph for the "credentials graph" and the "sample" credential graph that is packaged with the server. 
@@ -1292,8 +1263,7 @@ Once the server has started, issue a request passing the credentials with an `Au curl -X POST --insecure -u stephen:password -d "{\"gremlin\":\"100-1\"}" "https://localhost:8182" [[credentials-dsl]] -Credentials Graph DSL -+++++++++++++++++++++ +===== Credentials Graph DSL The "credentials graph", which has been mentioned in previous sections, is used by Gremlin Server to hold the list of users who can authenticate to the server. It is possible to use virtually any `Graph` instance for this task as long @@ -1331,8 +1301,7 @@ credentials.countUsers() ---- [[script-execution]] -Script Execution -++++++++++++++++ +===== Script Execution It is important to remember that Gremlin Server exposes a `ScriptEngine` instance that allows for remote execution of arbitrary code on the server. Obviously, this situation can represent a security risk or, more minimally, provide @@ -1504,8 +1473,7 @@ A final thought on the topic of `CompilerCustomizerProvider` implementations is can fine tune the Groovy compilation process. Read more about compilation customization in the link:http://docs.groovy-lang.org/latest/html/documentation/#compilation-customizers[Groovy Documentation]. -Serialization -^^^^^^^^^^^^^ +==== Serialization Gremlin Server can accept requests and return results using different serialization formats. Serializers implement the `MessageSerializer` interface. In doing so, they express the list of mime types they expect to support. When @@ -1520,8 +1488,7 @@ some serializers have additional configuration options as defined by the `serial available and/or expected keys are dependent on the serializer being used. Gremlin Server comes packaged with two different serializers: GraphSON and Gryo. -GraphSON -++++++++ +===== GraphSON The GraphSON serializer produces human readable output in JSON format and is a good configuration choice for those trying to use TinkerPop from non-JVM languages. 
JSON obviously has wide support across virtually all major @@ -1556,8 +1523,7 @@ type names, so interpretation from non-JVM languages will be required. It has t |ioRegistries |A list of `IoRegistry` implementations to be applied to the serializer. |_none_ |========================================================= -Gryo -++++ +===== Gryo The Gryo serializer utilizes Kryo-based serialization which produces a binary output. This format is best consumed by JVM-based languages. @@ -1587,8 +1553,7 @@ important to use cases where server types need to be coerced to client types (i. but not on the client). Implementations should typically instantiate `ClassResolver` implementations that are extensions of the `GryoClassResolver` as this class is important to most serialization tasks in TinkerPop. -Metrics -^^^^^^^ +==== Metrics Gremlin Server produces metrics about its operations that can yield some insight into how it is performing. These metrics are exposed in a variety of ways: @@ -1618,13 +1583,11 @@ session-based requests where "engine-name" will be the actual name of the engine * `engine-name.sessionless.*` - metrics related to different `GremlinScriptEngine` instances configured for sessionless requests where "engine-name" will be the actual name of the engine, such as "gremlin-groovy". -Best Practices -~~~~~~~~~~~~~~ +=== Best Practices The following sections define best practices for working with Gremlin Server. -Tuning -^^^^^^ +==== Tuning image:gremlin-handdrawn.png[width=120,float=right] Tuning Gremlin Server for a particular environment may require some simple trial-and-error, but the following represent some basic guidelines that might be useful: @@ -1654,8 +1617,7 @@ the query as `g.V(1).valueMap(true)` than `g.V(1)`, as the former returns a `Map without all the associated structure which can slow the response. 
[[parameterized-scripts]] -Parameterized Scripts -^^^^^^^^^^^^^^^^^^^^^ +==== Parameterized Scripts image:gremlin-parameterized.png[width=150,float=left] Use script parameterization. Period. Gremlin Server caches all scripts that are passed to it. The cache is keyed based on the a hash of the script. Therefore `g.V(1)` and @@ -1679,8 +1641,7 @@ section. It controls the maximum number of parameters that can be passed to the Use of this setting can prevent accidental long run compilations, which individually are not terribly oppressive to the server, but taken as a group under high concurrency would be considered detrimental. -Cache Management -^^^^^^^^^^^^^^^^ +==== Cache Management If Gremlin Server processes a large number of unique scripts, the global function cache will grow beyond the memory available to Gremlin Server and an `OutOfMemoryError` will loom. Script parameterization goes a long way to solving @@ -1712,8 +1673,7 @@ client.submit("[1,2,3,x]", params); ---- [[sessions]] -Considering Sessions -^^^^^^^^^^^^^^^^^^^^ +==== Considering Sessions The preferred approach for issuing requests to Gremlin Server is to do so in a sessionless manner. The concept of "sessionless" refers to a request that is completely encapsulated within a single transaction, such that the script @@ -1792,8 +1752,7 @@ A session is a "heavier" approach to the simple "request/response" approach of s necessary for a given use case. [[considering-transactions]] -Considering Transactions -^^^^^^^^^^^^^^^^^^^^^^^^ +==== Considering Transactions Gremlin Server performs automated transaction handling for "sessionless" requests (i.e. no state between requests) and for "in-session" requests with that feature enabled. It will automatically commit or rollback transactions depending @@ -1807,8 +1766,7 @@ Gremlin Server will only close transactions on the graphs specified by the `alia will simply have Gremlin Server close transactions on all graphs for every request. 
[[considering-state]] -Considering State -^^^^^^^^^^^^^^^^^ +==== Considering State With REST and any sessionless requests, there is no variable state maintained between requests. Therefore, when <<connecting-via-console,connecting with the console>>, for example, it is not possible to create a variable in @@ -1836,8 +1794,7 @@ All functions created via scripts are global to the server. gremlin> :> def subtractIt(int x, int y) { x - y } ==>null gremlin> :> subtractIt(8,7) -==>1 ----- +== ==>1 If this behavior is not desirable there are several options. A first option would be to consider using sessions. Each session gets its own `ScriptEngine`, which maintains its own isolated cache of global functions, whereas sessionless @@ -1868,8 +1825,7 @@ In the above REST-based requests, the bindings contain a special parameter that immediately forget the script after execution. In this way, the function does not end up being globally available. [[gremlin-plugins]] -Gremlin Plugins ---------------- +== Gremlin Plugins image:gremlin-plugin.png[width=125] @@ -1879,8 +1835,7 @@ link:http://tinkerpop.apache.org/docs/x.y.z/dev/provider/#gremlin-plugins[Provid how to develop custom plugins. [[credentials-plugin]] -Credentials Plugin -~~~~~~~~~~~~~~~~~~ +=== Credentials Plugin image:gremlin-server.png[width=200,float=left] xref:gremlin-server[Gremlin Server] supports an authentication model where user credentials are stored inside of a `Graph` instance. This database can be managed with the @@ -1894,8 +1849,7 @@ gremlin> :plugin use tinkerpop.credentials This plugin imports the appropriate classes for managing the credentials graph. [[gephi-plugin]] -Gephi Plugin -~~~~~~~~~~~~ +=== Gephi Plugin image:gephi-logo.png[width=200, float=left] link:http://gephi.org/[Gephi] is an interactive visualization, exploration, and analysis platform for graphs. 
The link:https://gephi.org/plugins/#/plugin/graphstreaming[Graph Streaming] @@ -2011,8 +1965,7 @@ Gephi plugin configuration parameters as accepted via the `:remote config` comma |========================================================= [[server-plugin]] -Server Plugin -~~~~~~~~~~~~~ +=== Server Plugin image:gremlin-server.png[width=200,float=left] xref:gremlin-server[Gremlin Server] remotely executes Gremlin scripts that are submitted to it. The Server Plugin provides a way to submit scripts to Gremlin Server for remote @@ -2021,8 +1974,7 @@ processing. Read more about the plugin and how it works in the Gremlin Server s NOTE: The Server Plugin is enabled in the Gremlin Console by default. [[sugar-plugin]] -Sugar Plugin -~~~~~~~~~~~~ +=== Sugar Plugin image:gremlin-sugar.png[width=120,float=left] In previous versions of Gremlin-Groovy, there were numerous link:http://en.wikipedia.org/wiki/Syntactic_sugar[syntactic sugars] that users could rely on to make their traversals @@ -2044,8 +1996,7 @@ gremlin> :plugin use tinkerpop.sugar TIP: When using Sugar in a Groovy class file, add `static { SugarLoader.load() }` to the head of the file. Note that `SugarLoader.load()` will automatically call `GremlinLoader.load()`. -Graph Traversal Methods -^^^^^^^^^^^^^^^^^^^^^^^ +==== Graph Traversal Methods If a `GraphTraversal` property is unknown and there is a corresponding method with said name off of `GraphTraversal` then the property is assumed to be a method call. This enables the user to omit `( )` from the method name. However, @@ -2062,8 +2013,7 @@ g.V.outE.weight <3> <2> The traversal is interpreted as `g.V().values('name')`. <3> A chain of zero-argument step calls with a property value call. -Range Queries -^^^^^^^^^^^^^ +==== Range Queries The `[x]` and `[x..y]` range operators in Groovy translate to `RangeStep` calls. 
@@ -2074,8 +2024,7 @@ g.V[0..<2] g.V[2] ---- -Logical Operators -^^^^^^^^^^^^^^^^^ +==== Logical Operators The `&` and `|` operator are overloaded in `SugarGremlinPlugin`. When used, they introduce the `AndStep` and `OrStep` markers into the traversal. See <<and-step,`and()`>> and <<or-step,`or()`>> for more information. @@ -2092,8 +2041,7 @@ t.toString() <1> Introducing the `AndStep` with the `&` operator. <2> Introducing the `OrStep` with the `|` operator. -Traverser Methods -^^^^^^^^^^^^^^^^^ +==== Traverser Methods It is rare that a user will ever interact with a `Traverser` directly. However, if they do, some method redirects exist to make it easy. @@ -2105,16 +2053,14 @@ g.V.map{it.name} // sugar ---- [[utilities-plugin]] -Utilities Plugin -~~~~~~~~~~~~~~~~ +=== Utilities Plugin The Utilities Plugin provides various functions, helper methods and imports of external classes that are useful in the console. NOTE: The Utilities Plugin is enabled in the Gremlin Console by default. [[benchmarking-and-profiling]] -Benchmarking and Profiling -^^^^^^^^^^^^^^^^^^^^^^^^^^ +==== Benchmarking and Profiling The link:https://code.google.com/p/gperfutils/[GPerfUtils] library provides a number of performance utilities for Groovy. Specifically, these tools cover benchmarking and profiling. @@ -2135,8 +2081,7 @@ profile { g.V().iterate() }.prettyPrint() ---- [[describe-graph]] -Describe Graph -^^^^^^^^^^^^^^ +==== Describe Graph A good implementation of the Gremlin APIs will validate their features against the xref:validating-with-gremlin-test[Gremlin test suite]. To learn more about a specific implementation's compliance with the test suite, use the `describeGraph` function. 
@@ -2148,8 +2093,7 @@ describeGraph(HadoopGraph) ---- [[gremlin-archetypes]] -Gremlin Archetypes ------------------- +== Gremlin Archetypes TinkerPop has a number of link:https://maven.apache.org/guides/introduction/introduction-to-archetypes.html[Maven archetypes], which provide example project templates to quickly get started with TinkerPop. The available archetypes are as follows: http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/gremlin-variants.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/gremlin-variants.asciidoc b/docs/src/reference/gremlin-variants.asciidoc index 85fd1aa..3076c14 100644 --- a/docs/src/reference/gremlin-variants.asciidoc +++ b/docs/src/reference/gremlin-variants.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[gremlin-variants]] -Gremlin Variants -================ += Gremlin Variants image::gremlin-house-of-mirrors.png[width=1024] @@ -38,15 +37,13 @@ please review the link:http://tinkerpop.apache.org/docs/current/tutorials/gremli tutorial. [[gremlin-java]] -Gremlin-Java ------------- +== Gremlin-Java image:gremlin-java-drawing.png[width=130,float=right] Apache TinkerPop's Gremlin-Java implements Gremlin within the Java8 language and can be used by any Java8 compliant virtual machine. Gremlin-Java is considered the canonical, reference implementation of Gremlin and serves as the foundation by which all other Gremlin language variants should emulate. -The Lambda Solution -~~~~~~~~~~~~~~~~~~~ +=== The Lambda Solution Supporting link:https://en.wikipedia.org/wiki/Anonymous_function[anonymous functions] across languages is difficult as most language do not support lambda introspection and thus, code analysis. In Gremlin-Java, Java8 lambdas can be leveraged. 
@@ -70,8 +67,7 @@ g.V().out("knows").sideEffect(Lambda.consumer("println it")) g.V().as("a").out("knows").as("b").select("b").by(Lambda.<Vertex,Integer>function("it.value('name').length()")) [[gremlin-groovy]] -Gremlin-Groovy --------------- +== Gremlin-Groovy image:gremlin-groovy-drawing.png[width=130,float=right] Apache TinkerPop's Gremlin-Groovy implements Gremlin within the link:http://groovy.apache.org[Apache Groovy] language. As a JVM-based language variant, Gremlin-Groovy is backed by @@ -83,8 +79,7 @@ statically from the anonymous traversal `__` and therefore, must always be prefi `g.V().as('a').in().as('b').where(__.not(__.as('a').out().as('b')))` [[gremlin-python]] -Gremlin-Python --------------- +== Gremlin-Python image:gremlin-python-drawing.png[width=130,float=right] Apache TinkerPop's Gremlin-Python implements Gremlin within the link:https://www.python.org/[Python] language and can be used on any Python virtual machine including the popular @@ -179,8 +174,7 @@ Likewise, if it has lambdas represented in Python, it will use Gremlin-Python (e IMPORTANT: Gremlin-Python's `Traversal` class supports the standard Gremlin methods such as `next()`, `nextTraverser()`, `toSet()`, `toList()`, etc. Such "terminal" methods trigger the evaluation of the traversal. -RemoteConnection Submission -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +=== RemoteConnection Submission There are various ways to submit a traversal to a `RemoteConnection`. Just as in Gremlin-Java, there are various "terminal/action methods" off of `Traversal`. @@ -191,8 +185,7 @@ There are various ways to submit a traversal to a `RemoteConnection`. Just as in * `Traversal.toSet()` * `Traversal.iterate()` -Gremlin-Python Sugar -~~~~~~~~~~~~~~~~~~~~ +=== Gremlin-Python Sugar Python supports meta-programming and operator overloading. There are three uses of these techniques in Gremlin-Python that makes traversals a bit more concise. 
@@ -204,8 +197,7 @@ g.V().both()[1].toList() g.V().both().name.toList() ---- -Static Enums and Methods -~~~~~~~~~~~~~~~~~~~~~~~~ +=== Static Enums and Methods Gremlin has various tokens (e.g. `T`, `P`, `Order`, `Operator`, etc.) that are represented in Gremlin-Python as Python `Enums`. @@ -248,8 +240,7 @@ That is, without the `__.`-prefix. g.V().repeat(out()).times(2).name.fold().toList() ---- -Bindings -~~~~~~~~ +=== Bindings When a traversal bytecode is sent over a `RemoteConnection` (e.g. Gremlin Server), it will be translated, compiled, and then executed. If the same traversal is sent again, translation and compilation can be skipped as the previously compiled version should be cached. @@ -264,8 +255,7 @@ g.V(('id',1)).out('created').name.toList() g.V(('id',4)).out('created').name.toList() ---- -Traversal Strategies -~~~~~~~~~~~~~~~~~~~~ +=== Traversal Strategies In order to add and remove <<traversalstrategy,traversal strategies>> from a traversal source, Gremlin-Python has a `TraversalStrategy` class along with a collection of subclasses that mirror the standard Gremlin-Java strategies. @@ -288,8 +278,7 @@ Apache TinkerPop's JVM-based Gremlin traversal machine. As such, their `apply(Tr the strategy is encoded in the Gremlin-Python bytecode and transmitted to the Gremlin traversal machine for re-construction machine-side. -The Lambda Solution -~~~~~~~~~~~~~~~~~~~ +=== The Lambda Solution Supporting link:https://en.wikipedia.org/wiki/Anonymous_function[anonymous functions] across languages is difficult as most language do not support lambda introspection and thus, code analysis. In Gremlin-Python, @@ -319,8 +308,7 @@ g.V().out().map(lambda: "x: len(x.get().value('name'))").sum().toList() <7> The default lambda language is changed back to Gremlin-Python. <8> If the `lambda`-prefix is not provided, then it is appended automatically in order to give a more natural look to the expression. 
-Custom Serialization -~~~~~~~~~~~~~~~~~~~~ +=== Custom Serialization Gremlin-Python provides a GraphSON 2.0 serialization package with the standard Apache TinkerPop `g`-types registered (see link:http://tinkerpop.apache.org/docs/x.y.z/dev/io/#graphson-2d0[GraphSON 2.0]). It is possible for users to add @@ -356,8 +344,7 @@ connection = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g', ---- [[gremlin-DotNet]] -Gremlin.Net ------------ +== Gremlin.Net WARNING: Gremlin.Net does not yet have an official release. It is for developers who want to experiment with TinkerPop in the .NET ecosystem. @@ -400,8 +387,7 @@ location (e.g. Gremlin Server). IMPORTANT: Gremlin-DotNet’s `ITraversal` interface supports the standard Gremlin methods such as `Next()`, `NextTraverser()`, `ToSet()`, `ToList()`, etc. Such "terminal" methods trigger the evaluation of the traversal. -RemoteConnection Submission -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +=== RemoteConnection Submission Very similar to Gremlin-Python and Gremlin-Java, there are various ways to submit a traversal to a `IRemoteConnection` using terminal/action methods off of `ITraversal`. @@ -412,8 +398,7 @@ terminal/action methods off of `ITraversal`. * `ITraversal.ToSet()` * `ITraversal.Iterate()` -Static Enums and Methods -~~~~~~~~~~~~~~~~~~~~~~~~ +=== Static Enums and Methods Gremlin has various tokens (e.g. `T`, `P`, `Order`, `Operator`, etc.) that are represented in Gremlin-DotNet as Enums. @@ -440,8 +425,7 @@ Finally, with using static `__`, anonymous traversals like `__.Out()` can be exp [source,csharp] g.V().Repeat(Out()).Times(2).Values("name").Fold().ToList() -Bindings -~~~~~~~~ +=== Bindings When a traversal bytecode is sent over a `IRemoteConnection` (e.g. Gremlin Server), it will be translated, compiled, and then executed. 
If the same traversal is sent again, translation and compilation can be skipped as the previously @@ -458,8 +442,7 @@ g.V(b.Of("id", 1)).Out("created").Values("name").toList() g.V(b.Of("id", 4)).Out("created").Values("name").toList() ---- -Traversal Strategies -~~~~~~~~~~~~~~~~~~~~ +=== Traversal Strategies In order to add and remove traversal strategies from a traversal source, Gremlin-DotNet has an `AbstractTraversalStrategy` class along with a collection of subclasses that mirror the standard Gremlin-Java strategies. @@ -488,8 +471,7 @@ NOTE: Many of the TraversalStrategy classes in Gremlin-DotNet are proxies to the JVM-based Gremlin traversal machine. As such, their `Apply(ITraversal)` method does nothing. However, the strategy is encoded in the Gremlin-DotNet bytecode and transmitted to the Gremlin traversal machine for re-construction machine-side. -Custom Serialization -~~~~~~~~~~~~~~~~~~~~ +=== Custom Serialization Gremlin-DotNet provides a GraphSON 2.0 serialization package with the standard Apache TinkerPop `g`-types registered (see link:http://tinkerpop.apache.org/docs/x.y.z/dev/io/#graphson-2d0[GraphSON 2.0]). 
It is possible for users to add new @@ -543,4 +525,4 @@ var graphsonWriter = new GraphSONWriter( new Dictionary<Type, IGraphSONSerializer> {{typeof(MyType), new MyClassWriter()}}); var gremlinClient = new GremlinClient(new GremlinServer("localhost", 8182), graphsonReader, graphsonWriter); ----- \ No newline at end of file +---- http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-giraph.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-giraph.asciidoc b/docs/src/reference/implementations-giraph.asciidoc index 5aaac7c..f83903d 100644 --- a/docs/src/reference/implementations-giraph.asciidoc +++ b/docs/src/reference/implementations-giraph.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[giraphgraphcomputer]] -GiraphGraphComputer -^^^^^^^^^^^^^^^^^^^ +==== GiraphGraphComputer [source,xml] ---- @@ -84,8 +83,7 @@ then these values will be used by Giraph. However, if these are not specified an `GraphComputer.workers()` then `GiraphGraphComputer` will try to compute the number of workers/threads to use based on the cluster's profile. -Loading with BulkLoaderVertexProgram -++++++++++++++++++++++++++++++++++++ +===== Loading with BulkLoaderVertexProgram The <<bulkloadervertexprogram, BulkLoaderVertexProgram>> is a generalized bulk loader that can be used to load large amounts of data to and from different `Graph` implementations. The following code demonstrates how to load @@ -144,4 +142,4 @@ gremlin.tinkergraph.graphFormat=gryo gremlin.tinkergraph.graphLocation=/tmp/tinkergraph.kryo ---- -NOTE: The path to TinkerGraph needs to be included in the `HADOOP_GREMLIN_LIBS` for the above example to work. \ No newline at end of file +NOTE: The path to TinkerGraph needs to be included in the `HADOOP_GREMLIN_LIBS` for the above example to work. 
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-hadoop-end.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-hadoop-end.asciidoc b/docs/src/reference/implementations-hadoop-end.asciidoc index 3fe768d..0650c9c 100644 --- a/docs/src/reference/implementations-hadoop-end.asciidoc +++ b/docs/src/reference/implementations-hadoop-end.asciidoc @@ -14,8 +14,7 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. //// -Input/Output Formats -~~~~~~~~~~~~~~~~~~~~ +=== Input/Output Formats image:adjacency-list.png[width=300,float=right] Hadoop-Gremlin provides various I/O formats -- i.e. Hadoop `InputFormat` and `OutputFormat`. All of the formats make use of an link:http://en.wikipedia.org/wiki/Adjacency_list[adjacency list] @@ -25,8 +24,7 @@ outgoing edges. {empty} + [[gryo-io-format]] -Gryo I/O Format -^^^^^^^^^^^^^^^ +==== Gryo I/O Format * **InputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat` * **OutputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat` @@ -38,8 +36,7 @@ savings over text-based representations. NOTE: The `GryoInputFormat` is splittable. 
[[graphson-io-format]] -GraphSON I/O Format -^^^^^^^^^^^^^^^^^^^ +==== GraphSON I/O Format * **InputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat` * **OutputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat` @@ -61,8 +58,7 @@ The data below represents an adjacency list representation of the classic Tinker ---- [[script-io-format]] -Script I/O Format -^^^^^^^^^^^^^^^^^ +==== Script I/O Format * **InputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat` * **OutputFormat**: `org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptOutputFormat` @@ -71,8 +67,7 @@ Script I/O Format `Vertex` objects, respectively. This can be considered the most general `InputFormat`/`OutputFormat` possible in that Hadoop-Gremlin uses the user provided script for all reading/writing. -ScriptInputFormat -+++++++++++++++++ +===== ScriptInputFormat The data below represents an adjacency list representation of the classic TinkerGraph toy graph. First line reads, "vertex `1`, labeled `person` having 2 property values (`marko` and `29`) has 3 outgoing edges; the first edge is @@ -124,8 +119,7 @@ def parse(line, factory) { The resultant `Vertex` denotes whether the line parsed yielded a valid Vertex. As such, if the line is not valid (e.g. a comment line, a skip line, etc.), then simply return `null`. -ScriptOutputFormat Support -++++++++++++++++++++++++++ +===== ScriptOutputFormat Support The principle above can also be used to convert a vertex to an arbitrary `String` representation that is ultimately streamed back to a file in HDFS. This is the role of `ScriptOutputFormat`. 
`ScriptOutputFormat` requires that the @@ -148,8 +142,7 @@ def stringify(vertex) { -Storage Systems -~~~~~~~~~~~~~~~ +=== Storage Systems Hadoop-Gremlin provides two implementations of the `Storage` API: @@ -157,8 +150,7 @@ Hadoop-Gremlin provides two implementations of the `Storage` API: * `SparkContextStorage`: Access Spark persisted RDD data. [[interacting-with-hdfs]] -Interacting with HDFS -^^^^^^^^^^^^^^^^^^^^^ +==== Interacting with HDFS The distributed file system of Hadoop is called link:http://en.wikipedia.org/wiki/Apache_Hadoop#Hadoop_distributed_file_system[HDFS]. The results of any OLAP operation are stored in HDFS accessible via `hdfs`. For local file system access, there is `fs`. @@ -176,8 +168,7 @@ hdfs.ls() ---- [[interacting-with-spark]] -Interacting with Spark -^^^^^^^^^^^^^^^^^^^^^^ +==== Interacting with Spark If a Spark context is persisted, then Spark RDDs will remain the Spark cache and accessible over subsequent jobs. RDDs are retrieved and saved to the `SparkContext` via `PersistedInputRDD` and `PersistedOutputRDD` respectivly. @@ -198,8 +189,7 @@ spark.rm('output') spark.ls() ---- -A Command Line Example -~~~~~~~~~~~~~~~~~~~~~~ +=== A Command Line Example image::pagerank-logo.png[width=300] @@ -337,4 +327,4 @@ Vertex 4 ("josh") is isolated below: "age":[{"id":7,"value":32}]} } } ----- \ No newline at end of file +---- http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-hadoop-start.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-hadoop-start.asciidoc b/docs/src/reference/implementations-hadoop-start.asciidoc index 028b1c1..31ecf6b 100644 --- a/docs/src/reference/implementations-hadoop-start.asciidoc +++ b/docs/src/reference/implementations-hadoop-start.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. 
//// [[hadoop-gremlin]] -Hadoop-Gremlin --------------- +== Hadoop-Gremlin [source,xml] ---- @@ -39,8 +38,7 @@ tutorial. Moreover, if using `GiraphGraphComputer` or `SparkGraphComputer` it is familiarize their self with Giraph (link:http://giraph.apache.org/quick_start.html[Getting Started]) and Spark (link:http://spark.apache.org/docs/latest/quick-start.html[Quick Start]). -Installing Hadoop-Gremlin -~~~~~~~~~~~~~~~~~~~~~~~~~ +=== Installing Hadoop-Gremlin If using <<gremlin-console,Gremlin Console>>, it is important to install the Hadoop-Gremlin plugin. Note that Hadoop-Gremlin requires a Gremlin Console restart after installing. @@ -95,8 +93,7 @@ directory). [source,shell] export HADOOP_GREMLIN_LIBS=/usr/local/gremlin-console/ext/giraph-gremlin/lib -Properties Files -~~~~~~~~~~~~~~~~ +=== Properties Files `HadoopGraph` makes use of properties files which ultimately get turned into Apache configurations and/or Hadoop configurations. @@ -152,8 +149,7 @@ underlying OLAP engine (e.g. Spark, Giraph, etc.) works and understand the numer these systems. Such knowledge can help alleviate out of memory exceptions, slow load times, slow processing times, garbage collection issues, etc. -OLTP Hadoop-Gremlin -~~~~~~~~~~~~~~~~~~~ +=== OLTP Hadoop-Gremlin image:hadoop-pipes.png[width=180,float=left] It is possible to execute OLTP operations over a `HadoopGraph`. However, realize that the underlying HDFS files are not random access and thus, to retrieve a vertex, a linear scan @@ -175,8 +171,7 @@ g.V().out().out().values('name') g.V().group().by{it.value('name')[1]}.by('name').next() ---- -OLAP Hadoop-Gremlin -~~~~~~~~~~~~~~~~~~~ +=== OLAP Hadoop-Gremlin image:hadoop-furnace.png[width=180,float=left] Hadoop-Gremlin was designed to execute OLAP operations via `GraphComputer`. 
The OLTP examples presented previously are reproduced below, but using `TraversalVertexProgram` @@ -233,4 +228,4 @@ gremlin> :plugin use tinkerpop.spark WARNING: Hadoop, Spark, and Giraph all depend on many of the same libraries (e.g. ZooKeeper, Snappy, Netty, Guava, etc.). Unfortunately, typically these dependencies are not to the same versions of the respective libraries. As such, it is best to *not* have both Spark and Giraph plugins loaded in the same console session nor in the same Java -project (though intelligent `<exclusion>`-usage can help alleviate conflicts in a Java project). \ No newline at end of file +project (though intelligent `<exclusion>`-usage can help alleviate conflicts in a Java project). http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-intro.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-intro.asciidoc b/docs/src/reference/implementations-intro.asciidoc index f1992e5..506c61e 100644 --- a/docs/src/reference/implementations-intro.asciidoc +++ b/docs/src/reference/implementations-intro.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[implementations]] -Implementations -=============== += Implementations image::gremlin-racecar.png[width=325] @@ -24,4 +23,4 @@ TinkerPop offers several reference implementations of its interfaces that are no but also represent models by which different graph providers can build their systems. More specific documentation on how to build systems at this level of the API can be found in the link:http://tinkerpop.apache.org/docs/x.y.z/dev/provider/[Provider Documentation]. The following sections -describe the various reference implementations and their usage. \ No newline at end of file +describe the various reference implementations and their usage. 
http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-neo4j.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-neo4j.asciidoc b/docs/src/reference/implementations-neo4j.asciidoc index bb16bec..be7371f 100644 --- a/docs/src/reference/implementations-neo4j.asciidoc +++ b/docs/src/reference/implementations-neo4j.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[neo4j-gremlin]] -Neo4j-Gremlin -------------- +== Neo4j-Gremlin [source,xml] ---- @@ -60,8 +59,7 @@ TIP: To host Neo4j in <<gremlin-server,Gremlin Server>>, the dependencies must f copied to the Gremlin Server path. The automated method for doing this would be to execute `bin/gremlin-server.sh -i org.apache.tinkerpop neo4j-gremlin x.y.z`. -Indices -~~~~~~~ +=== Indices Neo4j 2.x indices leverage vertex labels to partition the index space. TinkerPop3 does not provide method interfaces for defining schemas/indices for the underlying graph system. Thus, in order to create indices, it is important to @@ -115,8 +113,7 @@ graph.close() <5> Find all vertices whose name is Garcia which requires a linear scan of all the data in the graph. <6> Drop the created index. -Multi/Meta-Properties -~~~~~~~~~~~~~~~~~~~~~ +=== Multi/Meta-Properties `Neo4jGraph` supports both multi- and meta-properties (see <<vertex-properties,vertex properties>>). These features are not native to Neo4j and are implemented using "hidden" Neo4j nodes. For example, when a vertex has multiple @@ -153,8 +150,7 @@ nodes" adjacent to the vertex. If a vertex property key/value is required for in required -- e.g. `CREATE INDEX ON :person(name)` and `CREATE INDEX ON :vertexProperty(name)` (see <<_indices,Neo4j indices>>). -Cypher -~~~~~~ +=== Cypher image::gremlin-loves-cypher.png[width=400] @@ -176,8 +172,7 @@ back into imperative Gremlin. 
TIP: For those developers using <<gremlin-server,Gremlin Server>> against Neo4j, it is possible to do Cypher queries by simply placing the Cypher string in `graph.cypher(...)` before submission to the server. -Multi-Label -~~~~~~~~~~~ +=== Multi-Label TinkerPop3 requires every `Element` to have a single, immutable string label (i.e. a `Vertex`, `Edge`, and `VertexProperty`). In Neo4j, a `Node` (vertex) can have an @@ -228,8 +223,7 @@ IMPORTANT: `LabelP.of()` is only required if multi-labels are leveraged. `LabelP filtering/looking-up vertices by their label(s) as the standard `P.eq()` does a direct match on the `::`-representation of `vertex.label()` -Loading with BulkLoaderVertexProgram -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +=== Loading with BulkLoaderVertexProgram The <<bulkloadervertexprogram, BulkLoaderVertexProgram>> is a generalized bulk loader that can be used to load large amounts of data to and from Neo4j. The following code demonstrates how to load the modern graph from TinkerGraph @@ -259,8 +253,7 @@ gremlin.neo4j.conf.node_auto_indexing=true gremlin.neo4j.conf.relationship_auto_indexing=true ---- -High Availability Configuration -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +=== High Availability Configuration image:neo4j-ha.png[width=400,float=right] TinkerPop supports running Neo4j with its fault tolerant master-slave replication configuration, referred to as its link:http://neo4j.com/docs/operations-manual/current/#_neo4j_cluster_install[High Availability (HA) cluster]. 
From the http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-spark.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-spark.asciidoc b/docs/src/reference/implementations-spark.asciidoc index ca91db6..c7a7f24 100644 --- a/docs/src/reference/implementations-spark.asciidoc +++ b/docs/src/reference/implementations-spark.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[sparkgraphcomputer]] -SparkGraphComputer -^^^^^^^^^^^^^^^^^^ +==== SparkGraphComputer [source,xml] ---- @@ -86,8 +85,7 @@ image::spark-algorithm.png[width=775] |gremlin.spark.persistStorageLevel |What `StorageLevel` to use when persisted RDDs via `PersistedOutputRDD` (default `MEMORY_ONLY`). |======================================================== -InputRDD and OutputRDD -++++++++++++++++++++++ +===== InputRDD and OutputRDD If the provider/user does not want to use Hadoop `InputFormats`, it is possible to leverage Spark's RDD constructs directly. An `InputRDD` provides a read method that takes a `SparkContext` and returns a graphRDD. Likewise, @@ -99,16 +97,14 @@ This can save a significant amount of time and space resources. If the `InputRDD `SparkGraphComputer` will partition the graph using a `org.apache.spark.HashPartitioner` with the number of partitions being either the number of existing partitions in the input (i.e. input splits) or the user specified number of `GraphComputer.workers()`. -Storage Levels -++++++++++++++ +===== Storage Levels The `SparkGraphComputer` uses `MEMORY_ONLY` to cache the input graph and the output graph by default. Users should be aware of the impact of different storage levels, since the default settings can quickly lead to memory issues on larger graphs. 
An overview of Spark's persistence settings is provided in link:http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence[Spark's programming guide]. -Using a Persisted Context -+++++++++++++++++++++++++ +===== Using a Persisted Context It is possible to persist the graph RDD between jobs within the `SparkContext` (e.g. SparkServer) by leveraging `PersistedOutputRDD`. Note that `gremlin.spark.persistContext` should be set to `true` or else the persisted RDD will be destroyed when the `SparkContext` closes. @@ -127,8 +123,7 @@ references to that Spark Context. The exception to this rule are those propertie Finally, there is a `spark` object that can be used to manage persisted RDDs (see <<interacting-with-spark, Interacting with Spark>>). [[bulkdumpervertexprogramusingspark]] -Exporting with BulkDumperVertexProgram -++++++++++++++++++++++++++++++++++++++ +===== Exporting with BulkDumperVertexProgram The <<bulkdumpervertexprogram, BulkDumperVertexProgram>> exports a whole graph in any of the supported Hadoop GraphOutputFormats (`GraphSONOutputFormat`, `GryoOutputFormat` or `ScriptOutputFormat`). The example below takes a Hadoop graph as the input (in `GryoInputFormat`) and exports it as a GraphSON file @@ -144,8 +139,7 @@ hdfs.ls('output') hdfs.head('output/~g') ---- -Loading with BulkLoaderVertexProgram -++++++++++++++++++++++++++++++++++++ +===== Loading with BulkLoaderVertexProgram The <<bulkloadervertexprogram, BulkLoaderVertexProgram>> is a generalized bulk loader that can be used to load large amounts of data to and from different `Graph` implementations. The following code demonstrates how to load the @@ -197,4 +191,4 @@ gremlin.tinkergraph.graphFormat=gryo gremlin.tinkergraph.graphLocation=/tmp/tinkergraph.kryo ---- -IMPORTANT: The path to TinkerGraph jars needs to be included in the `HADOOP_GREMLIN_LIBS` for the above example to work. 
\ No newline at end of file +IMPORTANT: The path to TinkerGraph jars needs to be included in the `HADOOP_GREMLIN_LIBS` for the above example to work. http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/cff12774/docs/src/reference/implementations-tinkergraph.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/reference/implementations-tinkergraph.asciidoc b/docs/src/reference/implementations-tinkergraph.asciidoc index 33c4dbf..4ebafb0 100644 --- a/docs/src/reference/implementations-tinkergraph.asciidoc +++ b/docs/src/reference/implementations-tinkergraph.asciidoc @@ -15,8 +15,7 @@ See the License for the specific language governing permissions and limitations under the License. //// [[tinkergraph-gremlin]] -TinkerGraph-Gremlin -------------------- +== TinkerGraph-Gremlin [source,xml] ---- @@ -82,8 +81,7 @@ data to the graph. NOTE: TinkerGraph is distributed with Gremlin Server and is therefore automatically available to it for configuration. -Configuration -~~~~~~~~~~~~~ +=== Configuration TinkerGraph has several settings that can be provided on creation via `Configuration` object: