Hello,

TinkerPop 3.2.0-SNAPSHOT has made leaps and bounds towards completely aligning 
Gremlin OLTP and Gremlin OLAP. What has got me really excited is that there is 
such a strong conceptual alignment between the following components:

        VertexProgram <=> Traversal
        Iteration <=> Step
        Messages <=> Traversers
        MessageCombiner <=> TraverserSet ("bulking")
        BSP <=> Barrier
        Workers <=> Parallel Steps
        Master <=> Sequential Steps
        Memory <=> SideEffects

TraversalVertexProgram is very clean -- its lays atop the GraphComputer API in 
a natural, effortless way.

However, there is one last pairing that needs some better alignment: 
GraphComputer Memory and Traversal SideEffects. A Memory slot has the notion of 
a key, a value, and a reducer (binary operator). A Traversal SideEffect as the 
notion of a key and a value. I think we should enable Traversal SideEffects to 
support registered reducers. If we do this, then there is perfect alignment 
between the two models and we won't have to have "if(onGraphComputer)"-type 
logic in our side-effect steps.

Right now in GroupCountSideEffectStep we do this:

public void sideEffect(final Traverser<S> traverser) {
  Map<E,Long> groupCountMap = 
this.getTraversal().getSideEffects().get(this.sideEffectKey);
  MapHelper.incr(traverser.get(), traverser.bulk(), groupCountMap)
}

We are explicitly getting the Map from the sideEffects and updating it. This 
model will not generally work in OLAP because groupCountMap is a distributed 
data structure and thus, local updates to a Map don't distribute. I have it 
working currently in master/, but at the cost of not being able to read the 
sideEffect, only write to it. To make TraversalSideEffects consistent across 
both OLTP and OLAP, I think we should express GroupCountSideEffectStep like 
this (*** analogously for GroupSideEffectStep, TreeSideEffectStep, etc.):

public void sideEffect(final Traverser<S> traverser) {
  this.getTraversal().getSideEffects().add(this.sideEffectKey, 
Collections.singletonMap(traverser.get(), traverser.bulk());
}

Moreover, TraversalSideEffects should have the following method:

        TraversalSideEffects.register(final String key, final Supplier<A> 
initialValue, final BinaryOperator<A> reducer)

Note that we already have:

        
https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSideEffects.java#L88

We can deprecate the current registerSupplier() in support of register(). 
Moreover, for backwards compatibility, BinaryOperator<A> reducer would simply 
be Operator.assign. 
        
https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Operator.java#L59-L62
Thus, this would not be a breaking change and it will ensure a natural 
congruence between these two related computing structures -- Memory and 
TraversalSideEffects.

RELATED JIRA: https://issues.apache.org/jira/browse/TINKERPOP-1192

Thoughts?,
Marko.

http://markorodriguez.com

Reply via email to