[
https://issues.apache.org/jira/browse/TINKERPOP3-866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942466#comment-14942466
]
Matt Frantz commented on TINKERPOP3-866:
----------------------------------------
What if we create a new {{GraphTraversal}} DSL interface for each set of
breaking changes, allowing the entire existing interface to be deprecated?
Then users can opt in to the specific dialect they want using whatever
mechanism we have for choosing a DSL (I don't know it off the top of my
head). We could then target the change in the default DSL to a specific
version, with enough advance warning to please folks, while retaining the
older DSLs as an option for a certain number of releases.
> GroupStep and Traversal-Based Reductions
> ----------------------------------------
>
> Key: TINKERPOP3-866
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-866
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.0.1-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Labels: breaking
> Fix For: 3.1.0-incubating
>
>
> Right now {{GroupStep}} is defined as:
> {code}
> public final class GroupStep<S, K, V, R> extends ReducingBarrierStep<S,
> Map<K, R>> implements MapReducer, TraversalParent {
> private Traversal.Admin<S, K> keyTraversal = null;
> private Traversal.Admin<S, V> valueTraversal = null;
> private Traversal.Admin<Collection<V>, R> reduceTraversal = null;
> ...
> {code}
> Look at {{reduceTraversal}}. It takes a {{Collection<V>}} of "values" and
> reduces them to a "reduction" {{R}}. Why are we using {{Collection<V>}}? Why
> is this not:
> {code}
> private Traversal.Admin<V, R> reduceTraversal = null;
> {code}
> Now, when a new {{K}} is created (and reduce is defined), we clone
> {{reduceTraversal}}. Thus, each key has its own {{reduceTraversal}} (an
> identical clone) that operates in a stream-like fashion on {{V}} to yield
> {{R}}. This enables us to remove the {{Collection<V>}} (memory hog) and
> allows us to define {{GroupCountStep}} in terms of {{GroupStep}} without
> (?limited?) computational cost. HOWEVER, this changes the API, as people who
> did this:
> {code}
> g.V.group.by(label()).by(outE().count()).by(sum(local))
> {code}
> would now have to do this:
> {code}
> g.V.group.by(label()).by(outE().count()).by(sum())
> {code}
> It's very minor, given the speedup we would gain and the ability for us to
> now do "groupCount" efficiently on arbitrary values -- not just bulks (e.g.
> sacks).
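
For illustration, the difference between reducing a buffered {{Collection<V>}} and streaming each {{V}} into a per-key reducer can be sketched in plain Java. This is only a rough sketch under stated assumptions, not the TinkerPop implementation: {{GroupReduceSketch}}, {{Vertex}}, {{Reducer}} and every method name below are hypothetical, and a {{Supplier}} of fresh reducers stands in for cloning {{reduceTraversal}} per key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

public class GroupReduceSketch {

    // Hypothetical stand-in for a vertex: a label plus its out-edge count.
    static final class Vertex {
        final String label;
        final Long outEdges;
        Vertex(String label, long outEdges) { this.label = label; this.outEdges = outEdges; }
    }

    // Hypothetical stand-in for a per-key clone of reduceTraversal: V in, R out.
    interface Reducer<V, R> {
        void accept(V value);
        R result();
    }

    // Current shape: buffer every value per key, then reduce each Collection<V>.
    static <S, K, V, R> Map<K, R> groupBuffered(List<S> input, Function<S, K> key,
            Function<S, V> value, Function<Collection<V>, R> reduce) {
        Map<K, Collection<V>> buffers = new HashMap<>();
        for (S s : input) {
            buffers.computeIfAbsent(key.apply(s), k -> new ArrayList<>()).add(value.apply(s));
        }
        Map<K, R> out = new HashMap<>();
        buffers.forEach((k, vs) -> out.put(k, reduce.apply(vs)));
        return out;
    }

    // Proposed shape: a fresh reducer per new key; values stream in, nothing is buffered.
    static <S, K, V, R> Map<K, R> groupStreaming(List<S> input, Function<S, K> key,
            Function<S, V> value, Supplier<Reducer<V, R>> newReducer) {
        Map<K, Reducer<V, R>> reducers = new HashMap<>();
        for (S s : input) {
            reducers.computeIfAbsent(key.apply(s), k -> newReducer.get()).accept(value.apply(s));
        }
        Map<K, R> out = new HashMap<>();
        reducers.forEach((k, r) -> out.put(k, r.result()));
        return out;
    }

    // Streaming sum, playing the role of sum() rather than sum(local).
    static Reducer<Long, Long> sumReducer() {
        return new Reducer<Long, Long>() {
            private long total = 0L;
            public void accept(Long v) { total += v; }
            public Long result() { return total; }
        };
    }

    // Streaming count: a "groupCount" expressed as just another per-key reducer.
    static <V> Reducer<V, Long> countReducer() {
        return new Reducer<V, Long>() {
            private long n = 0L;
            public void accept(V v) { n++; }
            public Long result() { return n; }
        };
    }

    // Made-up sample data, for illustration only.
    static List<Vertex> sample() {
        return Arrays.asList(new Vertex("person", 2), new Vertex("person", 3),
                new Vertex("software", 0), new Vertex("person", 1));
    }

    static Map<String, Long> bufferedDemo() {
        return groupBuffered(sample(), (Vertex v) -> v.label, (Vertex v) -> v.outEdges,
                (Collection<Long> vs) -> Long.valueOf(vs.stream().mapToLong(Long::longValue).sum()));
    }

    static Map<String, Long> streamingDemo() {
        return groupStreaming(sample(), (Vertex v) -> v.label, (Vertex v) -> v.outEdges,
                GroupReduceSketch::sumReducer);
    }

    static Map<String, Long> groupCountDemo() {
        return groupStreaming(sample(), (Vertex v) -> v.label, (Vertex v) -> v.outEdges,
                () -> GroupReduceSketch.<Long>countReducer());
    }

    public static void main(String[] args) {
        System.out.println(bufferedDemo());
        System.out.println(streamingDemo());
        System.out.println(groupCountDemo());
    }
}
```

In this sketch both variants yield the same per-label sums, but the streaming one holds a single running total per key instead of a list of values, and the counting reducer shows how a group count falls out of the same mechanism.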