[jira] [Commented] (TINKERPOP3-716) subroutine step

Matt Frantz (JIRA) Thu, 27 Aug 2015 10:14:45 -0700

    [ 
https://issues.apache.org/jira/browse/TINKERPOP3-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717084#comment-14717084
 ]


Matt Frantz commented on TINKERPOP3-716:
----------------------------------------

The motivation for this RFE derives from my application design, which consists 
of traversals that are independently specified and tested, and then used to 
compose other traversals.

Suppose I compose the _Bar_ algorithm from the _Foo_ algorithm.  Each is 
implemented as a {{Traversal}}.  The Foo algorithm is the _sub-traversal_ and 
the Bar algorithm is the _super-traversal_.  The relationship is similar to a 
parent-child traversal relationship, but with additional restrictions on shared 
state.

There are four principle state variables that we'll consider in the following 
sections:
* Object (value of the traverser)
* Path
* Sack
* Side-Effects

We seek _encapsulation_ of this state, such that a sub-traversal's state does 
not blend with the super-traversal's state except as explicitly designated by 
the traversal author.

*Object*
Encapsulating the logic that modulates Object is already handled via {{map}} 
and {{flatMap}}.  The {{sub}} step would act similarly.  Since {{flatMap}} is 
more general, {{sub}} should probably behave as {{flatMap}}.

*Path*
Path encapsulation involves the step label namespace and the visibility of the 
path at both ends of the sub-traversal.  The guiding principles for proper 
encapsulation are as follows:
* A sub-traversal should (by default) have no visibility into the path of the 
super-traversal.
* A super-traversal should (by default) have no visibility into the path of the 
sub-traversal.
* A super-traversal MAY provide visibility into specific step labels of the 
super-traversal when invoking the sub-traversal.
* A sub-traversal MAY require visibility into specific step labels of the 
super-traversal through a binding/renaming option.
* A sub-traversal MAY provide visibility into specific step labels of the 
sub-traversal through an exporting option.
* A super-traversal MAY require visibility into specific step labels of the 
sub-traversal through a binding/renaming option.

These operations can be efficient, since they involve labeling steps, and 
possibly constructing a {{Path}} "view" with limited visibility into the 
underlying "real path".

_Path Importing_
No matter what Foo (the sub-traversal) does, it should not have access to the 
path of Bar (the super-traversal), unless Bar exports it.  That is, Foo could 
require explicitly importing a portion of Bar's path using something like a 
{{withPath}} modifier of the {{sub}} step.

{code}
Traversal computeFoo() {
  // Use path steps 'x', 'y', 'z' provided from super-traversal.
  Traversal t = select('x', 'y', 'z');
  ...
  return t;
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with the x-y-z path it requires from our own a-b-c steps.
  t.sub(computeFoo())
    .withPath('x', 'y', 'z')
    .by(select('a'))
    .by(select('b'))
    .by(select('c'));
  ...
  return t;
}
{code}

_Path Exporting_
No matter what Bar (the super-traversal) does, it should not have access to 
labeled steps of Foo (the sub-traversal), unless Foo exports them.  That is, 
Foo could explicitly export a portion of its path using something like a 
{{deselect}} (a.k.a. {{makePath}}) step.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');
  t.path();  // would see steps 'a', 'foo.x', and 'foo.y'
  t.select('a', 'foo.x', 'foo.y');  // OK
  return t;
}
{code}

*Sack*
The sack feature is useful, but it is limited because the sack type can only be 
specified at the source.  Here is a use case that involves a "scoped sack" 
implementation.  The "Foo" algorithm uses sack with a {{FooSack}} instance.  
However, "Bar" also wants to use sack, but wants to use its own {{BarSack}} 
instance.

{code}
Traversal computeFoo() {
  return __.withSack(new FooSack())...;
}

Traversal computeBar() {
  Traversal bar = __.withSack(new BarSack())...;
  bar.sub(computeFoo())...;
  return bar;
}
{code}

Thus, this issue is in part about being able to perform the {{withSack}} step 
on some interior traversal.

*Side-Effect*
As a global namespace, the side-effects are powerful, but it can also be 
difficult to coordinate partitioning of this global namespace in a modular 
traversal.  The {{sub}} step would enforce encapsulation of the side-effect 
namespace by default.  Each {{sub}} step would effectively provide a new 
{{SideEffects}} object.

{code}
Traversal computeFoo() {
  return __...store('a')...;
}

Traversal computeBar() {
  Traversal t = ...;
  t.sub(computeBar());
  ...
  t.store('a');  // This is NOT the same 'a' that Foo sees!
  ...
  return t;
}
{code}

You could selectively populate the side-effect of the sub-traversal using 
{{withSideEffect}} as a modifier.

{code}
Traversal computeFoo() {
   return __...select('a')...;  // Access side-effect 'a' which must be 
provided by super-traversal.
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with its required side-effect key 'a' from our own 'x'.
  t.sub(computeBar())
    .withSideEffect('a', select('x'));
  ...
  return t;
}
{code}

If you want to allow exporting of side-effects directly from the sub-traversal 
into the super-traversal, then something like {{deselect}} or 
{{makeSideEffects}} could be used.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  t.store('x')...store('y')...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');

  // OK, since we resolve from our own path for 'a', plus scoped, exported 
side-effect of Foo.
  t.select('a', 'foo.x', 'foo.y');

  return t;
}
{code}


> subroutine step
> ---------------
>
>                 Key: TINKERPOP3-716
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-716
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.0.0-incubating
>            Reporter: Matt Frantz
>            Assignee: Marko A. Rodriguez
>
> In our application, we build complex traversals.  We use a modular technique 
> in which lower-level traversals are used to compose higher-level traversals.  
> For example, we might have an API like this:
> {code}
> void computeFoo(GraphTraversal t);
> {code}
> The assumption is that the "root" traversal has some state in the form of its 
> traverser and path (and possible sack and side effects, although we're not 
> using those at the moment).  Importantly, all of this state is not fully 
> realized, as the goal is to build another traversal before executing it all 
> at once.  To use the "foo" traversal in the "bar" traversal, I might do this:
> {code}
> void computeBar(GraphTraversal t) {
>   // Perform some part of the "bar" calculation, including setting up the 
> parameters of "foo"
>   t.out()...;
>   // Attach the "foo" logic.
>   computeFoo(t);
>   // Continue with more "bar" logic.
>   t.blah()...;
> }
> {code}
> In the implementation of these modular traversals, we end up exposing certain 
> implementation details, especially in the form of labeled steps (path keys).  
> It is often the case that we don't necessarily want to leak this state.  We 
> often want these traversals to operate as if they were single steps, 
> discarding any evidence of graph wandering that might have occurred.
> What if there were a step which acted as a "stack" for the path?  (It might 
> also act as a stack for the sack; we figured we would need that if we ever 
> decide to use sack.)  Not only would this provide encapsulation, it would 
> allow the path to be pruned and thus improve runtime efficiency.
> We want traversal subroutines.  How about this signature?
> {code}
> GraphTraversal sub(Traversal subTraversal);
> {code}
> What this would do is guarantee that no path state (or sack state) would 
> escape.  The only effect would be on the traverser.  I think it could be 
> implemented using a {{MapStep}}, or possibly {{FlatMapStep}} in order to 
> realize a subTraversal that branches.
> What state would subTraversal have access to?  By default, it could only see 
> the traverser.  It has its own key namespace for the path, so it cannot see 
> the outer path.  It has its own sack.  It should probably have its own side 
> effect key namespace, too.
> If you want to opt-in state into the subTraversal, then perhaps you could 
> list the keys to form a pseudo-path:
> {code}
> GraphTraversal sub(Traversal subTraversal, String...stepOrSideEffectLabels);
> {code}
> One of the interesting things that this might provide (eventually) is the 
> ability to hang OLTP traversal off of OLAP traversals.  I had discussed this 
> need a while ago in the context of vertices that act as OLAP "computation 
> loci".  A "sub" could use a different engine, and thus embed a standard 
> traversal within a computer traversal.
> This would change our modular programming paradigm.  We could safely redesign 
> each module to produce an anonymous traversal like so:
> {code}
> GraphTraversal computeFoo();
> GraphTraversal computeBar() {
>   // Perform some part of the "bar" calculation, including setting up the 
> parameters of "foo"
>   GraphTraversal t = __.out()...;
>   // Attach the "foo" logic.
>   t.sub(computeFoo(t));
>   // Continue with more "bar" logic.
>   return t.in()...;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (TINKERPOP3-716) subroutine step

Reply via email to