Hi everyone,
I’ve been thinking about the removal of `aggregate(local)` in favour of the
composition of `local(aggregate())` which was
initially raised in Michael Schmidt’s proposal on Lazy vs Eager traversal
evaluation:
https://github.com/apache/tinkerpop/blob/2c3f31fdab535913c5a7b318e16f1c80bce57f35/docs/src/dev/future/proposal-scoping-5.asciidoc#proposed-further-simplifications
I believe there is merit in taking this proposal a little further. Michael made
a good point which is that TinkerPop has never
explicitly defined if providers are expected to mirror gremlin-core’s
pull-based traversal engine to be considered fully
compliant. My opinion here is that providers should be free to implement their
own traversal engine which uses a lazy,
eager, or hybrid evaluation model. This freedom does lead to some portability
issues with gremlin however, particularly
with traversals such as `g.V().groupCount(“x”).select(“x”)`:
In a lazy evaluation engine, this produces the following results where you can
see the map incrementally update:
==>[v[1]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[3]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
In an eager engine, the each result is instead the fully computed map:
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
This discrepancy is present in GroupSideEffectStep, GroupCountSideEffectStep,
TreeSideEffectStep, and SubgraphStep. Based
on my investigation, these “non-blocking, accumulating side effect steps” are
the only steps in gremlin which may lead to
inconsistent results when comparing lazy and eager traversal engines (excluding
AggregateLocalStep as that has been slated for
removal following the lazy/eager proposal). I’d like to propose adding barriers
to all 4 of these steps, which will bring their behaviour
in line with AggregateGlobalStep, and will block all traversers until the
starts have fully iterated and the side effect has been
completely processed. Any users who require the old non-blocking lazy results
may achieve that by embedding the step in local():
gremlin> g.V().local(groupCount("x")).select("x")
==>[v[1]:1]
==>[v[1]:1,v[2]:1]
==>[v[1]:1,v[2]:1,v[3]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1]
==>[v[1]:1,v[2]:1,v[3]:1,v[4]:1,v[5]:1,v[6]:1]
The benefit of the above pattern is that it will give the same result structure
regardless of if the traversal engine is lazy or eager.
This gives providers the freedom to use the evaluation strategy of their
choosing, without worrying about inconsistent query results
or importable gremlin. It’s also worth noting that most usages of these side
effect steps are paired with cap(), which itself is a barrier
which hides the differences between a lazy or eager engine. This distinction is
only relevant when these side effect steps are paired
with select() instead of cap().
Unless there are any objections raised, I plan to move forward with this
change, and intend to open a PR soon which adds the
proposed barriers to these steps.
Thanks,
Cole