Hello Matt (and others),
Perhaps you have some thoughts on this issue:
https://issues.apache.org/jira/browse/TINKERPOP3-825
Finally, here is the new sack-merge operator in action. As you can see, we can
now do energy diffusions. If you make your sack value a function of sin/cos,
then you can simulate wave dynamics and all the neat things that come with that
(e.g. quantum probabilities due to superposition of traverser state).
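To make the wave idea concrete before the console session below, here is a hedged sketch in plain Python (not TinkerPop API -- `split`, `merge`, and the complex-amplitude sack are all hypothetical illustrations): if a sack is a complex amplitude, sum-merging two traversers that arrive in phase is constructive interference, while a half-wavelength phase offset cancels them.

```python
import cmath

# Hypothetical model, NOT TinkerPop internals: a sack as a complex amplitude.
def split(sack, n):
    """Normalize one sack across n child traversers (the normSack idea)."""
    return [sack / n] * n

def merge(sacks):
    """Merge colliding traversers' sacks with the sum operator."""
    return sum(sacks)

# Two paths from a source arrive at the same vertex.
a, b = split(1.0 + 0j, 2)
in_phase = merge([a, b])                                 # amplitudes add
out_of_phase = merge([a, b * cmath.exp(1j * cmath.pi)])  # amplitudes cancel

print(abs(in_phase) ** 2)      # intensity 1.0 (constructive)
print(abs(out_of_phase) ** 2)  # ~0.0 (destructive)
```

The squared magnitude is the "quantum probability" reading: superposed traverser states reinforce or annihilate depending on relative phase.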
// starting at marko, the traverser has a sack of 1.0.
gremlin> g.withSack(1.0d,sum).V(1).sack()
==>1.0
// taking the outgoing knows-edges from marko, the sacks of the split traversers are normalized.
gremlin> g.withSack(1.0d,sum).V(1).local(out('knows').barrier(normSack)).sack()
==>0.5
==>0.5
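The arithmetic behind that normalization, sketched in plain Python (this is just the math, not the actual Consumer<TraverserSet<?>> implementation): the barrier gathers the split traversers and rescales their sacks so they sum to the original 1.0.

```python
# Hedged sketch of what a normSack-style barrier does to the gathered sacks.
def norm_sack(sacks):
    """Rescale the sacks of the gathered traversers so they sum to 1.0."""
    total = sum(sacks)
    return [s / total for s in sacks]

# V(1) splits over two knows-edges; each child copies the parent's 1.0 sack,
# then the barrier normalizes:
print(norm_sack([1.0, 1.0]))  # [0.5, 0.5]
```

The same rescaling works when the sacks have first been multiplied by edge weights: whatever the raw values, the gathered set always re-sums to 1.0 "energy".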
// because the merge operator is sum, sacks collapse via sum
gremlin> g.withSack(1.0d,sum).V(1).local(out('knows').barrier(normSack)).in('knows').barrier().sack()
==>1.0
==>1.0
Why two 1.0s?!?! Well, that is what the ticket above is all about. The "right"
solution is:
gremlin> g.withSack(1.0d,sum).V(1).local(out('knows').barrier(normSack)).in('knows').barrier().sideEffect{it.setBulk(1)}.sack()
==>1.0
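A hedged sketch of why the bulk matters there (plain Python modeling, not TinkerPop internals; `merge_traversers` is a hypothetical helper): when two traversers land on the same element, the barrier sum-merges their sacks but also sums their bulks, and a traverser of bulk 2 is reported twice by sack().

```python
# Hypothetical model of traverser merging at a barrier: (sack, bulk) pairs.
def merge_traversers(traversers):
    """Merge traversers on the same element: sum sacks (merge operator) and bulks."""
    sack = sum(s for s, _ in traversers)
    bulk = sum(b for _, b in traversers)
    return (sack, bulk)

merged = merge_traversers([(0.5, 1), (0.5, 1)])
print(merged)           # (1.0, 2) -> sack() reports 1.0 twice
fixed = (merged[0], 1)  # the sideEffect{it.setBulk(1)} workaround
print(fixed)            # (1.0, 1) -> a single 1.0
```

That is the whole ticket in miniature: the sack merge is correct, but the bulk still counts the merged traverser as two results.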
Please provide any feedback you may have,
Marko.
http://markorodriguez.com
On Sep 1, 2015, at 9:57 AM, Marko Rodriguez <[email protected]> wrote:
> Hello,
>
>> You say "the merge operator can easily just be: g.withSack(1.0, sum)", but
>> that is the syntax for the split operator, so do you need a third parameter
>> there?
>
> There are multiple overloads. Merge is a BiFunction and split is a
> UnaryOperator -- see the new_merge/ branch.
>
>> Also, is normalizeSack a user-defined traversal? In this case, it would be
>> doing math on the floating point sack values?
>
> No, it's a Consumer<TraverserSet<?>> used by CollectingBarrierStep.
>
>> One of the interesting (and relevant for my application) possibilities is
>> to direct the priority of the traverser execution based on what you might
>> call "propagation energy". That is, with each split or merge (or based on
>> other "amplification" criteria), the traversers lose momentum, and the
>> limited compute resources are applied to the traversers with the highest
>> momentum. I suppose this is a vendor-specific feature, but would there be
>> a story for directing the priority of execution of the set of traversers?
>
> You can always where(sack().gt(0.01)).
>
> HTH,
> Marko.
>
> http://markorodriguez.com
>
>
>>
>> On Mon, Aug 31, 2015 at 11:06 AM, Marko Rodriguez <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> In TinkerPop 3.0.0, at the last minute, I got rid of the sack() merge
>>> operator as I was lost on how we would merge traverser sacks. Why did I
>>> get gun shy? Watch:
>>>
>>> gremlin> g = TinkerFactory.createModern().traversal()
>>> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
>>> gremlin> g.withSack(1.0).V(1).outE().inV().sack()
>>> ==>1.0
>>> ==>1.0
>>> ==>1.0
>>>
>>> So v[1] has 3 outgoing edges, so it generates three traversers, and the
>>> sack of the parent traverser is simply copied to the children. What if you
>>> want to preserve a total sack value in the traversal? That is, there can
>>> never be more than 1.0 "energy" in the graph. Well, if your edges have
>>> weights (let's say) and they are always a fraction of 1.0, that would
>>> work. *** NOTE that the TinkerGraph modern graph is not 1.0 normalized. ***
>>>
>>> gremlin> g.withSack(1.0d).V(1).outE().sack(mult).by('weight').inV().sack()
>>> ==>0.4
>>> ==>0.5
>>> ==>1.0
>>>
>>> Hmm... but if they are not 1.0 normalized, well, that sucks. And it's hard
>>> to keep such data consistent when large graphs are constantly adding and
>>> removing edges… And what if you don't have weights!?
>>>
>>> For the last week I was going down this rabbit hole of "split operators"
>>> for the withSack() source method. It was all nasty and convoluted. However,
>>> last night and into this morning, I think we can solve this simply with a
>>> "barrier consumer." Watch.
>>>
>>> gremlin> g.withSack(1.0).V(1).local(outE().barrier(normalizeSack)).inV().sack()
>>> ==>0.3333333333333333
>>> ==>0.3333333333333333
>>> ==>0.3333333333333333
>>>
>>> Cool. What about if you want to have it normalized by edge weights?
>>>
>>> gremlin> g.withSack(1.0).V(1).local(outE().sack(mult).by('weight').barrier(normalizeSack)).inV().sack()
>>> ==>0.2105263157894737
>>> ==>0.2631578947368421
>>> ==>0.5263157894736842
>>>
>>> Notice how local(barrier()) gathers all the traversers generated by outE()
>>> for the current object and then allows you to do some mutation on them.
>>> NormalizeSack simply recomputes the sacks based on the aggregate.
>>>
>>> With this, the merge operator can easily just be: g.withSack(1.0,sum). And
>>> then, we can support furcating and converging "energy" in the graph.
>>>
>>> This will lead into some very very trippy (theoretical) work I've been
>>> doing with constructive and destructive wave interference models with
>>> Gremlin. We will be able to support "optical algorithms" -- refraction,
>>> diffusion, interference, etc.
>>>
>>> Any thoughts/concerns/recommendations?
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>>
>