[ https://issues.apache.org/jira/browse/TINKERPOP3-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941343#comment-14941343 ]
Marko A. Rodriguez commented on TINKERPOP3-863:
-----------------------------------------------
Yes, you are correct -- this duality you note is the fundamental problem.
I don't quite understand what you mean by {{onMerge}} and {{onSplit}} wrt
"boundary." We have those two methods in {{Traverser.Admin}} (called {{split}}
and {{merge}}).
https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/Traverser.java#L156-L175
Now, the behavior for "bulk" is not configurable at runtime like it is for
"sack." For "bulk," it's always {{this.bulk = this.bulk + other.bulk}}. With
"sack," in {{TraversalSideEffects}} you get:
https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSideEffects.java#L122-L147
This is how we can do:
{code}
g.withSack(1.0,clone(),sum) // initial value, split operator, merge operator
{code}
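To make the asymmetry concrete, here is a minimal Java sketch of that merge behavior. It is a toy model, not the actual {{Traverser.Admin}} code: {{ToyTraverser}} and its {{merge}} signature are made up for illustration. The only things taken from above are that bulks are always added and that the sack merge operator is supplied by the user.

```java
import java.util.function.BinaryOperator;

// Toy stand-in for a traverser; hypothetical, NOT the TinkerPop Traverser.Admin class.
final class ToyTraverser {
    final long bulk;
    final double sack;

    ToyTraverser(long bulk, double sack) {
        this.bulk = bulk;
        this.sack = sack;
    }

    // The bulk merge is hard-wired to addition; only the sack merge is pluggable.
    ToyTraverser merge(ToyTraverser other, BinaryOperator<Double> sackMerge) {
        return new ToyTraverser(this.bulk + other.bulk,
                                sackMerge.apply(this.sack, other.sack));
    }

    public static void main(String[] args) {
        ToyTraverser a = new ToyTraverser(1L, 0.5);
        ToyTraverser b = new ToyTraverser(1L, 0.5);
        ToyTraverser summed = a.merge(b, Double::sum); // sack merge operator: sum
        ToyTraverser maxed = a.merge(b, Double::max);  // sack merge operator: max
        System.out.println("sum: bulk=" + summed.bulk + " sack=" + summed.sack);
        System.out.println("max: bulk=" + maxed.bulk + " sack=" + maxed.sack);
    }
}
```

Whichever sack operator is chosen, the merged bulk comes out the same, because nothing in the sack configuration touches it.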
However, bulk still keeps incrementing as sacks are being split and merged.
The fundamental problem is that if you want to use sack to represent "energy"
(a real number), you run into this oddity:
{code}
gremlin> g.withSack(1.0f,sum).V(1).local(out('knows').barrier(normSack)) // good
==>v[2]
==>v[4]
gremlin> g.withSack(1.0f,sum).V(1).local(out('knows').barrier(normSack)).sack() // good again
==>0.5
==>0.5
gremlin> g.withSack(1.0f,sum).V(1).local(out('knows').barrier(normSack)).in('knows') // okay, two traversers back at v[1]
==>v[1]
==>v[1]
gremlin> g.withSack(1.0f,sum).V(1).local(out('knows').barrier(normSack)).in('knows').sack() // yep, and their sacks are 0.5 each
==>0.5
==>0.5
gremlin> g.withSack(1.0f,sum).V(1).local(out('knows').barrier(normSack)).in('knows').barrier().sack() // merged into a single traverser with sack 1.0, but bulk 2!!!!
==>1.0
==>1.0
{code}
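The arithmetic behind that last result can be sketched in a few lines of Java. This assumes the "total energy" a merged traverser stands for is its bulk times its sack, which is how I read the double {{1.0}} output above. {{DiamondEnergy}}, {{energy}}, and {{mergedEnergy}} are hypothetical names, and the pluggable bulk operator models the proposal in the issue description, not an existing API.

```java
import java.util.function.LongBinaryOperator;

// Hypothetical helper for the diamond case; nothing here is TinkerPop API.
final class DiamondEnergy {
    // Energy represented by one merged traverser: `bulk` copies, each carrying `sack`.
    static double energy(long bulk, double sack) {
        return bulk * sack;
    }

    // Merge two traversers that each have bulk 1. The sack merge is sum
    // (as in withSack(1.0f,sum)); the bulk merge is whatever operator is passed in.
    static double mergedEnergy(double sackA, double sackB, LongBinaryOperator bulkMerge) {
        long bulk = bulkMerge.applyAsLong(1L, 1L);
        double sack = sackA + sackB;
        return energy(bulk, sack);
    }

    public static void main(String[] args) {
        // Current behavior: bulks are added, so the energy doubles.
        System.out.println(mergedEnergy(0.5, 0.5, Long::sum));
        // Assumed alternative: bulk pinned to 1, so the energy is conserved.
        System.out.println(mergedEnergy(0.5, 0.5, (x, y) -> 1L));
    }
}
```

With summed bulks the merged traverser accounts for twice the energy that left v[1]; pinning the bulk to 1 keeps it conserved in this toy model.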
> [Proposal] Turn off bulking -- or is there something more general? (hope not).
> ------------------------------------------------------------------------------
>
> Key: TINKERPOP3-863
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-863
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.1.0-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.1.0-incubating
>
>
> I have a general question -- sometimes you want bulking and sometimes you
> don't. Why would you not want bulking? Well, let's say you have a sack of 1.0
> and you want to represent energy diffusion; thus, if a traverser splits
> and goes to two adjacent neighbors, each sack will be 0.5. Now, let's say
> those two traversers merge on the next step (a diamond-shaped graph): the
> merged traverser's sack is 1.0 (excellent!). However, its bulk is 2.
> Dah... Then the total energy in the graph is 2.0.
> Should we simply have "bulk" and "no bulk," or do we come up with a "bulk
> merge" model where users can add bulks (the current default and the only
> method), multiply bulks, take the min/max of bulks, etc.? I'm scared that the
> generalization might be overkill.
> The difference is:
> {code}
> g.withBulk(false)….. // binary -- don't use bulking.
> g.withBulk(true)... // default behavior, which currently just sums the bulks together.
> // or do we go with
> g.withBulk(mult)….. // when two traversers merge, multiply their bulks. Why would you do that? I have no idea, but it's general.
> g.withBulk(one) … // would be like binary=false: always merge to 1, thus one BinaryOperator(x,y) -> 1
> {code}
> Is this generalization of the bulk merge operator useful? Or do we say -- if
> you want to do complex functions on "energy" (bulk), you do it via
> sack?