Pieter… you have a fascinating problem, because it touches exactly on one of 
the fundamental concepts of Gremlin: local vs. global traversals. 

g.V().out().count() == g.V().local(out()).count()
  
However,

g.V().out().count() != g.V().local(out().count())

///////////////

gremlin> g.V().out().count()
==>6
gremlin> g.V().local(out()).count()
==>6
gremlin> g.V().out().count()
==>6
gremlin> g.V().local(out().count())
==>3
==>0
==>0
==>2
==>0
==>1
gremlin>
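
The same distinction can be seen outside of Gremlin. Below is a minimal plain-Java sketch, not TinkerPop code: the vertex names v1..v6 and the adjacency map are hypothetical, with out-degrees chosen only so that they match the counts printed in the console session above. The global count flattens every vertex's neighbours into one stream and counts once; the local count runs a separate count per vertex.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class LocalVsGlobalCount {
    // Hypothetical adjacency list (an assumption, not taken from the thread);
    // only the out-degrees 3,0,0,2,0,1 mirror the console output.
    static final Map<String, List<String>> OUT = new LinkedHashMap<>();
    static {
        OUT.put("v1", Arrays.asList("v2", "v3", "v4"));
        OUT.put("v2", Collections.<String>emptyList());
        OUT.put("v3", Collections.<String>emptyList());
        OUT.put("v4", Arrays.asList("v3", "v5"));
        OUT.put("v5", Collections.<String>emptyList());
        OUT.put("v6", Arrays.asList("v3"));
    }

    // Analogue of g.V().out().count(): one flattened stream, one count.
    static long globalCount() {
        return OUT.values().stream().flatMap(List::stream).count();
    }

    // Analogue of g.V().local(out().count()): one count per starting vertex.
    static List<Long> localCounts() {
        return OUT.values().stream()
                .map(neighbours -> (long) neighbours.size())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(globalCount());  // single number
        System.out.println(localCounts());  // one number per vertex
    }
}
```

The per-vertex counts sum to the global count, but the local version never sees more than one vertex's neighbours at a time.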

So, your ability to grab the "global stream" and not just a single object from 
it within a local traversal will require some trickery.
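
Why the barrier trick stops paying off inside a local traversal can be sketched in plain Java. This is only an illustration under stated assumptions: `batchedOut` is a hypothetical stand-in for a batching step such as SqlgVertexStepCompiled, the `queries` counter stands in for database round trips, and the sample graph imitates the A-to-B data in the tests quoted below. None of this is Sqlg or TinkerPop API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BatchingUnderLocal {
    static int queries = 0; // stands in for database round trips

    // Hypothetical batching step: drain every start it can see, then run one "query".
    static List<String> batchedOut(Iterator<String> starts, Map<String, List<String>> out) {
        List<String> batch = new ArrayList<>();
        while (starts.hasNext()) {
            batch.add(starts.next());
        }
        queries++; // one round trip per invocation, however large the batch
        List<String> results = new ArrayList<>();
        for (String v : batch) {
            results.addAll(out.getOrDefault(v, Collections.<String>emptyList()));
        }
        return results;
    }

    // Global child: the step is handed the whole stream of starts.
    static int queriesWhenGlobal(Map<String, List<String>> out) {
        queries = 0;
        batchedOut(out.keySet().iterator(), out);
        return queries;
    }

    // Local child: the parent hands over one start at a time.
    static int queriesWhenLocal(Map<String, List<String>> out) {
        queries = 0;
        for (String v : out.keySet()) {
            batchedOut(Collections.singletonList(v).iterator(), out);
        }
        return queries;
    }

    // Hypothetical data: n "A" vertices, each with one edge to its own "B" vertex.
    static Map<String, List<String>> sampleGraph(int n) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            out.put("a" + i, Collections.singletonList("b" + i));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = sampleGraph(10_000);
        System.out.println("global: " + queriesWhenGlobal(g) + " query");
        System.out.println("local:  " + queriesWhenLocal(g) + " queries");
    }
}
```

Draining the starts only helps when the step is handed the whole stream; when the parent feeds it one traverser per invocation, the batch size is always one.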

Look at:

g.V().local(out().count())

How would we do this so you pull all the V()’s into the Sqlg[out()] but still 
single stream post process? That’s a good question. Hmmmm…


g.V().aggregate('x').cap('x').local(sqlgOut().count())

Now, Sqlg[out()] would do this:

        1. Is the input a list? If yes, then execute all in batch.
        2. Then pop off the first mapping as the output to count() only!
        3. then Sqlg[out()].reset() method doesn’t clear…..

Wait… no, it can’t do that……………………………… because it will not next() correctly.

Hm. Wow…. mind blown.

Owwww………….

Marko.


> On Apr 20, 2017, at 10:47 AM, pieter <[email protected]> wrote:
> 
> Sorry, forwarding was not a good idea either,
> 
> Here is an example with global children and the batching works well.
> Sqlg does not currently optimize the 'where' (TraversalFilterStep).
> 
>     @Test
>     public void testBatchingIncomingTraversersOnVertexStep() {
>         int count = 10_000;
>         for (int i = 0; i < count; i++) {
>             Vertex a1 = this.sqlgGraph.addVertex(T.label, "A");
>             Vertex b1 = this.sqlgGraph.addVertex(T.label, "B");
>             a1.addEdge("ab", b1);
>         }
>         this.sqlgGraph.tx().commit();
> 
>         GraphTraversal traversal = this.sqlgGraph.traversal()
>                 .V().where(__.hasLabel("A"))
>                 .out();
>         printTraversalForm(traversal);
>         List<Vertex> vertices = traversal.toList();
>         assertEquals(count, vertices.size());
>     }
> 
> This prints out,
> 
> pre-strategy:[GraphStep(vertex,[]),
> TraversalFilterStep([HasStep([~label.eq(A)])]), VertexStep(OUT,vertex)]
> post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
> HasStep([~label.eq(A)]), SqlgVertexStepCompiled@[sqlgPathFakeLabel]]
> 
> The SqlgVertexStepCompiled is able to iterate all 10 000 incoming
> traversers and execute one query for the out().
> 
> This reduced the query time from 12 seconds to 0.4 seconds. Happiness!!
> 
> An example with a local traversal.
> 
>     @Test
>     public void testBatchingIncomingTraversersOnLocalVertexStep() {
>         int count = 10_000;
>         for (int i = 0; i < count; i++) {
>             Vertex a1 = this.sqlgGraph.addVertex(T.label, "A");
>             Vertex b1 = this.sqlgGraph.addVertex(T.label, "B");
>             a1.addEdge("ab", b1);
>         }
>         this.sqlgGraph.tx().commit();
> 
>         GraphTraversal traversal = this.sqlgGraph.traversal()
>                 .V().hasLabel("A")
>                 .local(
>                         __.out()
>                 );
>         printTraversalForm(traversal);
>         List<Vertex> vertices = traversal.toList();
>         Assert.assertEquals(count, vertices.size());
>     }
> 
> This prints out,
> 
> pre-strategy:[GraphStep(vertex,[]), HasStep([~label.eq(A)]),
> LocalStep([VertexStep(OUT,vertex)])]
> post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
> LocalStep([SqlgVertexStepCompiled@[sqlgPathFakeLabel]])]
> 
> In this case SqlgVertexStepCompiled is a local traversal of the
> LocalStep.
> 
> Iterating the starts only returns one traverser as the LocalStep only
> puts one on the traversal at a time.
> 
> I suppose I can replace LocalStep with a custom one, but there are many
> steps with local children, which would make things fragile if I were to
> replace so many steps in a copy-paste fashion.
> 
> Thanks
> Pieter
> 
> 
> On Thu, 2017-04-20 at 09:10 -0600, Marko Rodriguez wrote:
>> Hello,
>> 
>>> I have started optimizing Sqlg to do a bulk/barrier for its
>>> VertexStep
>>> optimizations.
>> 
>> Cool.
>> 
>>> Sqlg has two optimization strategies.
>>> 
>>> GraphStepStrategy and VertexStepStrategy. GraphStepStrategy
>>> executes
>>> first and then VertexStepStrategy.
>>> 
>>> GraphStepStrategy starts at the beginning of the traversal
>>> optimizing
>>> from left to right till it reaches a step that it can not optimize
>>> and
>>> terminates.
>> 
>> Makes sense.
>> 
>>> After that VertexStepStrategy tries to optimize what remains.
>>> It ultimately replaces optimizable sequential steps with a
>>> SqlgVertexStep.
>> 
>> Okay...
>> 
>>> Thus far the SqlgVertexStep always has one incoming traverser from
>>> where it continues the traversal. Basically it translates to a SQL
>>> WHERE clause with the incoming traversal element's id.
>>> 
>>> The current optimization is to bulk the incoming traversers and
>>> execute
>>> the traversal for all incoming traversers in one go. This reduces
>>> latency and has a drastic performance improvement.
>>> 
>>> I do the same as the existing BarrierSteps and iterate the `starts`
>>> to
>>> collect all the left incoming traversers and from there I continue
>>> and
>>> all is well.
>> 
>> Smart. You got chops.
>> 
>>> However for local traversals there is only one start on the
>>> traversal
>>> so the barrier idea is not working.
>>> 
>>> Is there a way to barrier all incoming left traversers on local
>>> traversals?
>> 
>> 
>> Eeeeeeeeeeeeeeee…… huuuuuuuhhhhhhhh…………….
>> 
>> There is an “easy” and there is a “hard.” Give me an example traversal
>> and let’s discuss from a more specific standpoint before
>> generalizing...
>> 
>> Thanks,
>> Marko.
>> 
>> http://markorodriguez.com
