Hi everyone,
We have reached a wall regarding the following problems:
1. GraphStrategies via wrappers are not working for various vendors.
2. TraversalEngine is specific to a Graph and thus, global to all
traversals spawned off that graph.
3. User-defined Traversal DSLs are not easily created and are not
amenable to OLAP processing.
As a solution, Stephen and I are bouncing around the idea of a TraversalContext:
Graph graph = GraphFactory.open(configuration);
GraphTraversalContext g = graph.traversal(GraphTraversal.of()
    .engine(StandardTraversalEngine.instance())
    .strategy(ReadOnlyTraversalStrategy.instance()));
g.V().out().values("name").iterate();
g.V().values("age").iterate(); // spawn as many traversals as you want off of g
In essence, we want to introduce one new level of indirection from the Graph to
the Traversal. This new level is called a "TraversalContext" (no better name
yet) and it bundles the following objects:
1. Graph (the raw data structure)
2. Traversal DSL (GraphTraversal, SocialTraversal, etc. etc.)
3. TraversalEngine (Spark, Giraph, Standard, etc.)
4. TraversalStrategies (ReadOnlyTraversalStrategy, IdTraversalStrategy,
PartitionTraversalStrategy, etc.)
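To make the bundling idea concrete, here is a minimal, self-contained sketch.
It is not the gist implementation and not the TinkerPop API: SketchGraph,
SketchEngine, SketchStrategy, and SketchContext are hypothetical stand-ins,
and the "DSL" is reduced to a function that spawns a traversal from the graph.
The point it tries to show is only that engine and strategies live on the
context rather than on the graph, so many traversals can be spawned off one
context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; these are NOT the TinkerPop types.
interface SketchGraph { List<String> vertexNames(); }
interface SketchEngine { String name(); }
interface SketchStrategy { String name(); }

// Immutable bundle of graph + DSL + engine + strategies.
final class SketchContext<T> {
    private final SketchGraph graph;
    private final Function<SketchGraph, T> dsl; // spawns a traversal of DSL type T
    private final SketchEngine engine;
    private final List<SketchStrategy> strategies;

    SketchContext(SketchGraph graph, Function<SketchGraph, T> dsl,
                  SketchEngine engine, List<SketchStrategy> strategies) {
        this.graph = graph;
        this.dsl = dsl;
        this.engine = engine;
        this.strategies = Collections.unmodifiableList(new ArrayList<>(strategies));
    }

    SketchEngine engine() { return engine; }
    List<SketchStrategy> strategies() { return strategies; }

    // Every call spawns a fresh traversal; engine and strategies stay per-context.
    T traversal() { return dsl.apply(graph); }
}

class TraversalContextSketch {
    // Spawns two traversals off one context and collects what they visit.
    static List<String> demo() {
        SketchGraph graph = () -> Arrays.asList("marko", "stephen");
        Function<SketchGraph, Iterator<String>> dsl = gr -> gr.vertexNames().iterator();
        SketchEngine engine = () -> "standard";
        SketchStrategy readOnly = () -> "read-only";
        SketchContext<Iterator<String>> g = new SketchContext<Iterator<String>>(
                graph, dsl, engine, Collections.singletonList(readOnly));

        List<String> seen = new ArrayList<>();
        g.traversal().forEachRemaining(seen::add);
        g.traversal().forEachRemaining(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Two contexts over the same graph could carry different engines or strategies
without touching the graph itself, which is the separation argued for here.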
You can see a working implementation of GraphTraversalContext here:
https://gist.github.com/okram/e67252705a920cd34571
What problems does this solve beyond 1, 2, and 3 above?
1. Graph.Iterators.vertexIterator() -> Graph.vertices() // back to a
"Blueprints"-style API for people wanting to work with the graph object directly
2. Graph.engine() goes away -- no more ThreadLocal hack.
3. Graph.of(Traversal.class) goes away -- we will be DSL friendly with
TraversalContext.
4. Graph.V() goes away -- it's all in terms of the traversal context.
GraphTraversalContext.V() exists.
One big pill that must be swallowed with this model -- Vertex.outE() doesn't
exist:
Vertex v = g.V().out().next();
String name = g.V(v).out().<String>values("name").next();
Graph, Vertex, Edge, etc. no longer have Traversal methods off of them (that is
NOT DSL friendly). Therefore, everything is off of TraversalContext. This is
actually going to make DSL execution on GraphComputer extremely easy, and it's
going to simplify vendor strategy code a lot -- strategies are simply cached
with respect to MyGraph.class.
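As a rough illustration of "cached with respect to MyGraph.class", the sketch
below keys a strategy list by graph class and builds it only on first use.
Everything here is an assumption made for illustration: StrategyCacheSketch,
its methods, the empty MyGraph class, and the use of plain strings for
strategies are hypothetical and do not reflect how TinkerPop stores
strategies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical vendor graph class; stands in for the MyGraph mentioned above.
final class MyGraph {}

// Hypothetical cache: one strategy list per graph class, built on first use.
final class StrategyCacheSketch {
    private final Map<Class<?>, List<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger builds = new AtomicInteger();

    List<String> strategiesFor(Class<?> graphClass, Supplier<List<String>> builder) {
        return cache.computeIfAbsent(graphClass, c -> {
            builds.incrementAndGet(); // count how often strategies are actually built
            return builder.get();
        });
    }

    int builds() { return builds.get(); }

    public static void main(String[] args) {
        StrategyCacheSketch cache = new StrategyCacheSketch();
        Supplier<List<String>> vendorStrategies = () -> Arrays.asList("ReadOnly", "Partition");
        List<String> first = cache.strategiesFor(MyGraph.class, vendorStrategies);
        List<String> second = cache.strategiesFor(MyGraph.class, vendorStrategies);
        System.out.println((first == second) + " " + cache.builds());
    }
}
```

Keying on the class rather than the instance means every traversal spawned
against any MyGraph instance would reuse the same strategy list.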
Anywho… it's a big deal. Functionally, things don't really change. It's just a
reorganization that will ultimately solve problems 1-3 listed at the top, which
need solving before we release GA.
If anyone has thoughts on or concerns about the proposed change, please raise them.
Thanks,
Marko.
http://markorodriguez.com