Re: [DISCUSS] Depth First Repeat step

2018-05-20 Thread Keith Lohnes
I had a little bit of time this weekend so I've updated the PR that I'd
submitted https://github.com/apache/tinkerpop/pull/838 with something that
seems to work, at least with the cursory testing I've done so far and jives
better with the sentiments expressed in this thread.

It still uses a similar overall design for the RepeatStep, but it checks to
see if there's a barrier step `.repeat(barrier())...` as Daniel had
suggested closer to the start of the thread instead of an explicit
`order(BFS)`. So the check wouldn't be, as Pieter was concerned about,
checking that a strategy hasn't been added but that no strategies have
added a `barrier` step to the repeat traversal in the RepeatStep.

On Thu, May 17, 2018 at 3:18 PM Stephen Mallette 
wrote:

> I got a bit busy for a few weeks and didn't' really have much time to focus
> on in on this thread. I figured I'd let it off my mind for a while and see
> if any new thoughts came up on it. Coming back on it now, I still think I
> like the approach I summarized from my own thoughts and others on the
> thread, but let me try to address a few things that were posted since my
> last reply:
>
> From pieter's last reply:
>
> >  What bothers me with this is that its in the negative. Providers now
> have too look for a specific strategy NOT being present to alter the
> behavior of the traversal.
>
> I suppose sqlg has gone down a path where the general strategies of
> TinkerPop don't always equate to the best way to do things. Do any other
> graph providers run into such an issue? I don't necessarily ask that
> question of Pieter, but of any graph providers out there who are following
> along. I'm personally not aware of any, but I'd be curious to know if there
> were others.
>
> >  Another thought about if we go with the `.withoutStrategies(Lazy...)`.
> Should these semantic marker type strategies not then be interfaces rather
> than concrete implementations.
>
> that might be true. we'd need to look at that more carefully in the
> implementation to see what makes sense. maybe we just need a marker
> interface for such strategies?
>
> From Keith's last reply:
>
> >  my main concern is there’s not really any great documentation around
> what LazyBarrierStrategy really does.
>
> true - we will want to write more about this for sure if it's to control
> the traversal algorithm.
>
> On Tue, May 1, 2018 at 9:53 AM, Keith Lohnes  wrote:
>
> > I presume that would be some source of technical derailment to this
> > approach
> >
> > Not necessarily, just the “step to define traversal order” idea. The
> > RepeatStep, at least from what I’ve seen, seems simplest to implement as
> a
> > BFS. BFS is easily made depth first with a stack. I think that can still
> be
> > the case with the RepeatStep here, it would just need another means of
> > determining which algorithm to use. That could probably be “Does this
> > repeat traversal end with a barrier step?” If that seems reasonable for
> > RepeatSteps then maybe that could help with some of Pieter’s “check for a
> > negative” concerns.
> >
> > Full BFS: manually add barrier()‘s
> >
> > Mixed mode: Default, let strategies do their thing OR remove strategies
> and
> > manually add your own barrier()
> > Full DFS: execute .withoutStrategies(Lazy...)
> >
> > This sounds reasonable, but my main concern is there’s not really any
> great
> > documentation around what LazyBarrierStrategy really does. Sure, you can
> > read the code, but I think that as the documentation currently sits,
> > inserts
> > barrier()-steps into a traversal where appropriate in order to gain the
> > "bulking optimization." is a bit vague. “Where appropriate” should
> probably
> > be well defined in the documentation so users can have a good
> understanding
> > of when that is without having to go to the code.
> >
> > On Sun, Apr 29, 2018 at 6:17 AM pieter gmail 
> > wrote:
> >
> > So to clarify, you write,
> > >
> > > Full DFS: execute `.withoutStrategies(Lazy...)`
> > >
> > > What bothers me with this is that its in the negative. Providers now
> > > have too look for a specific strategy NOT being present to alter the
> > > behavior of the traversal.
> > >
> > > SubgraphStrategy and EventStrategy and all others that I am aware of
> are
> > > positive. They do what they do and then the provider goes about its
> > > business optimizing the steps that are present. In the negative its
> more
> > > like a directive to the provider about the semantics of the traversal's
> > > steps.
> > >
> > > Another thought about if we go with the `.withoutStrategies(Lazy...)`.
> > > Should these semantic marker type strategies not then be interfaces
> > > rather than concrete implementations. Specifically with respect my
> vague
> > > understanding of whats coming with version 4, where the gremlin
> > > specification is completely detached from TinkerPop's reference
> > > implementation and perhaps even the JDK 

Re: [DISCUSS] Depth First Repeat step

2018-05-17 Thread Stephen Mallette
I got a bit busy for a few weeks and didn't' really have much time to focus
on in on this thread. I figured I'd let it off my mind for a while and see
if any new thoughts came up on it. Coming back on it now, I still think I
like the approach I summarized from my own thoughts and others on the
thread, but let me try to address a few things that were posted since my
last reply:

>From pieter's last reply:

>  What bothers me with this is that its in the negative. Providers now
have too look for a specific strategy NOT being present to alter the
behavior of the traversal.

I suppose sqlg has gone down a path where the general strategies of
TinkerPop don't always equate to the best way to do things. Do any other
graph providers run into such an issue? I don't necessarily ask that
question of Pieter, but of any graph providers out there who are following
along. I'm personally not aware of any, but I'd be curious to know if there
were others.

>  Another thought about if we go with the `.withoutStrategies(Lazy...)`.
Should these semantic marker type strategies not then be interfaces rather
than concrete implementations.

that might be true. we'd need to look at that more carefully in the
implementation to see what makes sense. maybe we just need a marker
interface for such strategies?

>From Keith's last reply:

>  my main concern is there’s not really any great documentation around
what LazyBarrierStrategy really does.

true - we will want to write more about this for sure if it's to control
the traversal algorithm.

On Tue, May 1, 2018 at 9:53 AM, Keith Lohnes  wrote:

> I presume that would be some source of technical derailment to this
> approach
>
> Not necessarily, just the “step to define traversal order” idea. The
> RepeatStep, at least from what I’ve seen, seems simplest to implement as a
> BFS. BFS is easily made depth first with a stack. I think that can still be
> the case with the RepeatStep here, it would just need another means of
> determining which algorithm to use. That could probably be “Does this
> repeat traversal end with a barrier step?” If that seems reasonable for
> RepeatSteps then maybe that could help with some of Pieter’s “check for a
> negative” concerns.
>
> Full BFS: manually add barrier()‘s
>
> Mixed mode: Default, let strategies do their thing OR remove strategies and
> manually add your own barrier()
> Full DFS: execute .withoutStrategies(Lazy...)
>
> This sounds reasonable, but my main concern is there’s not really any great
> documentation around what LazyBarrierStrategy really does. Sure, you can
> read the code, but I think that as the documentation currently sits,
> inserts
> barrier()-steps into a traversal where appropriate in order to gain the
> "bulking optimization." is a bit vague. “Where appropriate” should probably
> be well defined in the documentation so users can have a good understanding
> of when that is without having to go to the code.
>
> On Sun, Apr 29, 2018 at 6:17 AM pieter gmail 
> wrote:
>
> So to clarify, you write,
> >
> > Full DFS: execute `.withoutStrategies(Lazy...)`
> >
> > What bothers me with this is that its in the negative. Providers now
> > have too look for a specific strategy NOT being present to alter the
> > behavior of the traversal.
> >
> > SubgraphStrategy and EventStrategy and all others that I am aware of are
> > positive. They do what they do and then the provider goes about its
> > business optimizing the steps that are present. In the negative its more
> > like a directive to the provider about the semantics of the traversal's
> > steps.
> >
> > Another thought about if we go with the `.withoutStrategies(Lazy...)`.
> > Should these semantic marker type strategies not then be interfaces
> > rather than concrete implementations. Specifically with respect my vague
> > understanding of whats coming with version 4, where the gremlin
> > specification is completely detached from TinkerPop's reference
> > implementation and perhaps even the JDK itself.
> >
> > Cheers
> > Pieter
> >
> > On 29/04/2018 00:38, Stephen Mallette wrote:
> > >>   Is the objection to order(SearchAlgo) that it overloads order() or
> an
> > > objection to specifying, DFS/BFS in the traversal itself?
> > >
> > > for me, order() is a step. not a modulator. making it work as presented
> > in
> > > the pull request conflates those two concepts which i think is
> confusing
> > (i
> > > don't think we do that anywhere else, but feel free to correct me if
> i'm
> > > wrong).
> > >
> > >>   As mentioned earlier I am not a fan of making the user deal with
> > > strategies.  Strategies to my mind is not part of the gremlin as a
> > language
> > > specification
> > >
> > > There are already precedents for users dealing with strategies.
> Consider
> > > the decorations like SubgraphStrategy, EventStrategy, etc. So, it is
> not
> > > foreign or hidden. It is a component of traversal configuration.
> > >
> > > I would have to think 

Re: [DISCUSS] Depth First Repeat step

2018-05-01 Thread Keith Lohnes
I presume that would be some source of technical derailment to this
approach

Not necessarily, just the “step to define traversal order” idea. The
RepeatStep, at least from what I’ve seen, seems simplest to implement as a
BFS. BFS is easily made depth first with a stack. I think that can still be
the case with the RepeatStep here, it would just need another means of
determining which algorithm to use. That could probably be “Does this
repeat traversal end with a barrier step?” If that seems reasonable for
RepeatSteps then maybe that could help with some of Pieter’s “check for a
negative” concerns.

Full BFS: manually add barrier()‘s

Mixed mode: Default, let strategies do their thing OR remove strategies and
manually add your own barrier()
Full DFS: execute .withoutStrategies(Lazy...)

This sounds reasonable, but my main concern is there’s not really any great
documentation around what LazyBarrierStrategy really does. Sure, you can
read the code, but I think that as the documentation currently sits, inserts
barrier()-steps into a traversal where appropriate in order to gain the
"bulking optimization." is a bit vague. “Where appropriate” should probably
be well defined in the documentation so users can have a good understanding
of when that is without having to go to the code.

On Sun, Apr 29, 2018 at 6:17 AM pieter gmail 
wrote:

So to clarify, you write,
>
> Full DFS: execute `.withoutStrategies(Lazy...)`
>
> What bothers me with this is that its in the negative. Providers now
> have too look for a specific strategy NOT being present to alter the
> behavior of the traversal.
>
> SubgraphStrategy and EventStrategy and all others that I am aware of are
> positive. They do what they do and then the provider goes about its
> business optimizing the steps that are present. In the negative its more
> like a directive to the provider about the semantics of the traversal's
> steps.
>
> Another thought about if we go with the `.withoutStrategies(Lazy...)`.
> Should these semantic marker type strategies not then be interfaces
> rather than concrete implementations. Specifically with respect my vague
> understanding of whats coming with version 4, where the gremlin
> specification is completely detached from TinkerPop's reference
> implementation and perhaps even the JDK itself.
>
> Cheers
> Pieter
>
> On 29/04/2018 00:38, Stephen Mallette wrote:
> >>   Is the objection to order(SearchAlgo) that it overloads order() or an
> > objection to specifying, DFS/BFS in the traversal itself?
> >
> > for me, order() is a step. not a modulator. making it work as presented
> in
> > the pull request conflates those two concepts which i think is confusing
> (i
> > don't think we do that anywhere else, but feel free to correct me if i'm
> > wrong).
> >
> >>   As mentioned earlier I am not a fan of making the user deal with
> > strategies.  Strategies to my mind is not part of the gremlin as a
> language
> > specification
> >
> > There are already precedents for users dealing with strategies. Consider
> > the decorations like SubgraphStrategy, EventStrategy, etc. So, it is not
> > foreign or hidden. It is a component of traversal configuration.
> >
> > I would have to think about a new step like traverse(). I'd prefer to
> avoid
> > new steps when we already have other infrastructure from the language
> that
> > can support what we're trying to do. I think that's why I like the
> approach
> > I outlined in my last post - it just sorta fits with what we have.
> >
> >
> >
> > On Sat, Apr 28, 2018 at 1:37 PM, pieter gmail 
> > wrote:
> >
> >> Hi,
> >>
> >> Is the objection to order(SearchAlgo) that it overloads order() or an
> >> objection to specifying, DFS/BFS in the traversal itself?
> >> If so I do not really see how it is misplaced from a usability/API
> >> perspective. Seems pretty natural to me and very graphy at that.
> >>
> >> As mentioned earlier I am not a fan of making the user deal with
> >> strategies. Strategies to my mind is not part of the gremlin as a
> language
> >> specification. You mention LazyBarrierStrategy yet it is a internal
> >> strategy to TinkerPop. Sqlg removes it and passes all tests without me
> even
> >> being aware anymore if TinkerPop would have had a LazyBarrierStrategy or
> >> not. I see this as a strength of TinkerPop. Strategies can evolve over
> time
> >> without it having a effect on the specification and user's codebase out
> >> there.
> >>
> >> Here comes where my thoughts have gone to,
> >>
> >> g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)
> >>
> >> This should apply to all traversals, not just RepeatStep. e.g.
> >> g.V().out().out().traverse(DFS).
> >> If no traverse(DFS/BFS) is specified then the graph provider can do
> >> whatever they want, DFS, BFS or a mix of the two.
> >> If however traverse(BFS/DFS) is specified then the graph provider must
> >> follow the directive and the test suite will test for it.
> >>
> >> 

Re: [DISCUSS] Depth First Repeat step

2018-04-28 Thread Stephen Mallette
>  Is the objection to order(SearchAlgo) that it overloads order() or an
objection to specifying, DFS/BFS in the traversal itself?

for me, order() is a step. not a modulator. making it work as presented in
the pull request conflates those two concepts which i think is confusing (i
don't think we do that anywhere else, but feel free to correct me if i'm
wrong).

>  As mentioned earlier I am not a fan of making the user deal with
strategies.  Strategies to my mind is not part of the gremlin as a language
specification

There are already precedents for users dealing with strategies. Consider
the decorations like SubgraphStrategy, EventStrategy, etc. So, it is not
foreign or hidden. It is a component of traversal configuration.

I would have to think about a new step like traverse(). I'd prefer to avoid
new steps when we already have other infrastructure from the language that
can support what we're trying to do. I think that's why I like the approach
I outlined in my last post - it just sorta fits with what we have.



On Sat, Apr 28, 2018 at 1:37 PM, pieter gmail 
wrote:

> Hi,
>
> Is the objection to order(SearchAlgo) that it overloads order() or an
> objection to specifying, DFS/BFS in the traversal itself?
> If so I do not really see how it is misplaced from a usability/API
> perspective. Seems pretty natural to me and very graphy at that.
>
> As mentioned earlier I am not a fan of making the user deal with
> strategies. Strategies to my mind is not part of the gremlin as a language
> specification. You mention LazyBarrierStrategy yet it is a internal
> strategy to TinkerPop. Sqlg removes it and passes all tests without me even
> being aware anymore if TinkerPop would have had a LazyBarrierStrategy or
> not. I see this as a strength of TinkerPop. Strategies can evolve over time
> without it having a effect on the specification and user's codebase out
> there.
>
> Here comes where my thoughts have gone to,
>
> g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)
>
> This should apply to all traversals, not just RepeatStep. e.g.
> g.V().out().out().traverse(DFS).
> If no traverse(DFS/BFS) is specified then the graph provider can do
> whatever they want, DFS, BFS or a mix of the two.
> If however traverse(BFS/DFS) is specified then the graph provider must
> follow the directive and the test suite will test for it.
>
> traverse(DFS/BFS) is like any other traversal in that it can be specified
> mid traversal. Everything before it must follow the directive, everything
> after need not to.
>
> This will also be backward compatible which is kinda nice.
>
> Cheers
> Pieter
>
> On 27/04/2018 21:03, Stephen Mallette wrote:
>
>> It seems like we have general agreement on the easy things, that is:
>>
>> 1. this is a change for 3.4.0/master and
>> 2. we're all for a DFS option
>>
>> but we still have the hard part of having to come to consensus on how it's
>> used/implemented. The quick summary of this thread in that regard goes
>> something like this: We currently have this PR that introduces DFS, but
>> does so as a configuration per repeat() step. From what I gather a good
>> many of us seem to find that approach undesirable for one or more of the
>> following reasons:
>>
>> 1. The use of order() seems misplaced purely from a usability/API
>> perspective
>> 2. The approach seems to be at odds with how everything else works given
>> barrier() and strategies
>> 3. The approach seems to be at odds with our current mixed mode of DFS/BFS
>>
>> I think that we can see those issues resolve themselves with something
>> Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
>> will change that behavior as required. That change would yield the
>> following approaches:
>>
>> Full BFS: manually add `barrier()`'s
>> Mixed mode: Default, let strategies do their thing OR remove strategies
>> and
>> manually add your own barrier()
>> Full DFS: execute `.withoutStrategies(Lazy...)`
>>
>> Futherrmore, we probably should have some form of verification strategy
>> that ensures all BFS or all DFS so that users can't get tricked along the
>> way. It's not enough to just remove LazyBarrierStrategy to get DFS if
>> another strategy comes along and throws in a barrier().
>>
>> So if all that sounds good from a usability perspective, then we get all
>> three modes that we want using existing traversal semantics which removes
>> the three concerns I've summarized from this thread. We also get Keith's
>> desire to have control over which part of a traversal is BFS/DFS if users
>> want that capability because they can do a manual Mixed Mode and add their
>> own barrier() to control the flow. For Pieter (or any graph provider)
>> nothing really changes and there is opportunity to control flow with
>> strategies as usual.
>>
>> I haven't really focused much on what's involved in adapting the current
>> work in the PR to this approach as I more wanted to find the common ground
>> among all the 

Re: [DISCUSS] Depth First Repeat step

2018-04-28 Thread pieter gmail

Hi,

Is the objection to order(SearchAlgo) that it overloads order() or an 
objection to specifying, DFS/BFS in the traversal itself?
If so I do not really see how it is misplaced from a usability/API 
perspective. Seems pretty natural to me and very graphy at that.


As mentioned earlier I am not a fan of making the user deal with 
strategies. Strategies to my mind is not part of the gremlin as a 
language specification. You mention LazyBarrierStrategy yet it is a 
internal strategy to TinkerPop. Sqlg removes it and passes all tests 
without me even being aware anymore if TinkerPop would have had a 
LazyBarrierStrategy or not. I see this as a strength of TinkerPop. 
Strategies can evolve over time without it having a effect on the 
specification and user's codebase out there.


Here comes where my thoughts have gone to,

g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)

This should apply to all traversals, not just RepeatStep. e.g. 
g.V().out().out().traverse(DFS).
If no traverse(DFS/BFS) is specified then the graph provider can do 
whatever they want, DFS, BFS or a mix of the two.
If however traverse(BFS/DFS) is specified then the graph provider must 
follow the directive and the test suite will test for it.


traverse(DFS/BFS) is like any other traversal in that it can be 
specified mid traversal. Everything before it must follow the directive, 
everything after need not to.


This will also be backward compatible which is kinda nice.

Cheers
Pieter

On 27/04/2018 21:03, Stephen Mallette wrote:

It seems like we have general agreement on the easy things, that is:

1. this is a change for 3.4.0/master and
2. we're all for a DFS option

but we still have the hard part of having to come to consensus on how it's
used/implemented. The quick summary of this thread in that regard goes
something like this: We currently have this PR that introduces DFS, but
does so as a configuration per repeat() step. From what I gather a good
many of us seem to find that approach undesirable for one or more of the
following reasons:

1. The use of order() seems misplaced purely from a usability/API
perspective
2. The approach seems to be at odds with how everything else works given
barrier() and strategies
3. The approach seems to be at odds with our current mixed mode of DFS/BFS

I think that we can see those issues resolve themselves with something
Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
will change that behavior as required. That change would yield the
following approaches:

Full BFS: manually add `barrier()`'s
Mixed mode: Default, let strategies do their thing OR remove strategies and
manually add your own barrier()
Full DFS: execute `.withoutStrategies(Lazy...)`

Futherrmore, we probably should have some form of verification strategy
that ensures all BFS or all DFS so that users can't get tricked along the
way. It's not enough to just remove LazyBarrierStrategy to get DFS if
another strategy comes along and throws in a barrier().

So if all that sounds good from a usability perspective, then we get all
three modes that we want using existing traversal semantics which removes
the three concerns I've summarized from this thread. We also get Keith's
desire to have control over which part of a traversal is BFS/DFS if users
want that capability because they can do a manual Mixed Mode and add their
own barrier() to control the flow. For Pieter (or any graph provider)
nothing really changes and there is opportunity to control flow with
strategies as usual.

I haven't really focused much on what's involved in adapting the current
work in the PR to this approach as I more wanted to find the common ground
among all the people who commented on the thread. If we agree that this is
a nice way to go, then we can think more about "how" it could happen.

Keith, I saw you mention earlier that:


  The barrier step that Daniel described doesn’t currently work since

there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I presume that would be some source of technical derailment to this
approach.




On Tue, Apr 24, 2018 at 3:05 PM, Keith Lohnes  wrote:


Yeah, that's what I meant. The steps inside are replaced with some
JanusGraph stuff.

Cheers,
Keith


On Tue, Apr 24, 2018 at 1:52 PM pieter gmail 
wrote:


Nah, that looks to me like the RepeatStep survived. Just the nested
VertexStep that got replaced with JanusgraphVertexStep.
Good for them, first prize is not replacing anything.

Cheers
Pieter

On 24/04/2018 19:50, Keith Lohnes wrote:

It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail 

Re: [DISCUSS] Depth First Repeat step

2018-04-27 Thread Stephen Mallette
It seems like we have general agreement on the easy things, that is:

1. this is a change for 3.4.0/master and
2. we're all for a DFS option

but we still have the hard part of having to come to consensus on how it's
used/implemented. The quick summary of this thread in that regard goes
something like this: We currently have this PR that introduces DFS, but
does so as a configuration per repeat() step. From what I gather a good
many of us seem to find that approach undesirable for one or more of the
following reasons:

1. The use of order() seems misplaced purely from a usability/API
perspective
2. The approach seems to be at odds with how everything else works given
barrier() and strategies
3. The approach seems to be at odds with our current mixed mode of DFS/BFS

I think that we can see those issues resolve themselves with something
Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
will change that behavior as required. That change would yield the
following approaches:

Full BFS: manually add `barrier()`'s
Mixed mode: Default, let strategies do their thing OR remove strategies and
manually add your own barrier()
Full DFS: execute `.withoutStrategies(Lazy...)`

Futherrmore, we probably should have some form of verification strategy
that ensures all BFS or all DFS so that users can't get tricked along the
way. It's not enough to just remove LazyBarrierStrategy to get DFS if
another strategy comes along and throws in a barrier().

So if all that sounds good from a usability perspective, then we get all
three modes that we want using existing traversal semantics which removes
the three concerns I've summarized from this thread. We also get Keith's
desire to have control over which part of a traversal is BFS/DFS if users
want that capability because they can do a manual Mixed Mode and add their
own barrier() to control the flow. For Pieter (or any graph provider)
nothing really changes and there is opportunity to control flow with
strategies as usual.

I haven't really focused much on what's involved in adapting the current
work in the PR to this approach as I more wanted to find the common ground
among all the people who commented on the thread. If we agree that this is
a nice way to go, then we can think more about "how" it could happen.

Keith, I saw you mention earlier that:

>  The barrier step that Daniel described doesn’t currently work since
there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I presume that would be some source of technical derailment to this
approach.




On Tue, Apr 24, 2018 at 3:05 PM, Keith Lohnes  wrote:

> Yeah, that's what I meant. The steps inside are replaced with some
> JanusGraph stuff.
>
> Cheers,
> Keith
>
>
> On Tue, Apr 24, 2018 at 1:52 PM pieter gmail 
> wrote:
>
> > Nah, that looks to me like the RepeatStep survived. Just the nested
> > VertexStep that got replaced with JanusgraphVertexStep.
> > Good for them, first prize is not replacing anything.
> >
> > Cheers
> > Pieter
> >
> > On 24/04/2018 19:50, Keith Lohnes wrote:
> > > It looks like it,
> > > `g.V().has("foo", "bar").repeat(out()).emit().explain()`
> > >
> > > yields
> > >
> > > `[JanusGraphStep([],[foo.eq(bar)]),
> > > RepeatStep([JanusGraphVertexStep(OUT,vertex),
> > > RepeatEndStep],until(false),emit(true))]`
> > >
> > >
> > >
> > > On Tue, Apr 24, 2018 at 12:12 PM pieter gmail  >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
> > >> with g.V().repeat(out()).times(x) only x round trips to the db is
> needed
> > >> regardless of the size of the graph. Each time it will go to the db
> with
> > >> the full set of the previous step's incoming starts.
> > >>
> > >> But yeah TinkerPop's implementation is always the starting point so
> I'll
> > >> definitely have a look at how you have implemented DFS.
> > >>
> > >> BTW, does Janus graph use TinkerPop's default RepeatStep as is with no
> > >> optimization strategies?
> > >>
> > >> Cheers
> > >> Pieter
> > >>
> > >> On 24/04/2018 16:33, Keith Lohnes wrote:
> > >>> Pieter,
> > >>>
> > >>> If you take a look at https://github.com/apache/tinkerpop/pull/838
> DFS
> > >> is
> > >>> implemented as a modification to BFS. It's taking the starts that
> come
> > in
> > >>> from a BFS and stashing them to be processed later. I haven't seen a
> > big
> > >>> performance difference on JanusGraph; At least for the queries that
> > I've
> > >>> been running with it. I'm not terribly familiar with Sqlg, but I
> wonder
> > >> if
> > >>> in the context of how DFS is implemented there, it may be less of a
> > >>> concern.
> > >>>
> > >>> Cheers,
> > >>> Keith
> > >>>
> > >>> On Thu, Apr 19, 2018 at 12:46 PM pieter gmail <
> pieter.mar...@gmail.com
> > >
> > >>> wrote:
> > >>>
> >  Hi,
> > 
> >  Not really sure either what 'global' means 

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread Keith Lohnes
Yeah, that's what I meant. The steps inside are replaced with some
JanusGraph stuff.

Cheers,
Keith


On Tue, Apr 24, 2018 at 1:52 PM pieter gmail 
wrote:

> Nah, that looks to me like the RepeatStep survived. Just the nested
> VertexStep that got replaced with JanusgraphVertexStep.
> Good for them, first prize is not replacing anything.
>
> Cheers
> Pieter
>
> On 24/04/2018 19:50, Keith Lohnes wrote:
> > It looks like it,
> > `g.V().has("foo", "bar").repeat(out()).emit().explain()`
> >
> > yields
> >
> > `[JanusGraphStep([],[foo.eq(bar)]),
> > RepeatStep([JanusGraphVertexStep(OUT,vertex),
> > RepeatEndStep],until(false),emit(true))]`
> >
> >
> >
> > On Tue, Apr 24, 2018 at 12:12 PM pieter gmail 
> > wrote:
> >
> >> Hi,
> >>
> >> Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
> >> with g.V().repeat(out()).times(x) only x round trips to the db is needed
> >> regardless of the size of the graph. Each time it will go to the db with
> >> the full set of the previous step's incoming starts.
> >>
> >> But yeah TinkerPop's implementation is always the starting point so I'll
> >> definitely have a look at how you have implemented DFS.
> >>
> >> BTW, does Janus graph use TinkerPop's default RepeatStep as is with no
> >> optimization strategies?
> >>
> >> Cheers
> >> Pieter
> >>
> >> On 24/04/2018 16:33, Keith Lohnes wrote:
> >>> Pieter,
> >>>
> >>> If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS
> >> is
> >>> implemented as a modification to BFS. It's taking the starts that come
> in
> >>> from a BFS and stashing them to be processed later. I haven't seen a
> big
> >>> performance difference on JanusGraph; At least for the queries that
> I've
> >>> been running with it. I'm not terribly familiar with Sqlg, but I wonder
> >> if
> >>> in the context of how DFS is implemented there, it may be less of a
> >>> concern.
> >>>
> >>> Cheers,
> >>> Keith
> >>>
> >>> On Thu, Apr 19, 2018 at 12:46 PM pieter gmail  >
> >>> wrote:
> >>>
>  Hi,
> 
>  Not really sure either what 'global' means technically with respect to
>  TinkerPop's current configuration support.
>  Perhaps it can be the start of a global configuration registry that
> can
>  be overridden per traversal.
> 
>  I get that DFS is the preferred default but for Sqlg the performance
>  impact is so great that I'd rather, if possible have BFS as its
> default.
> 
>  I am not sure about this but I reckon that any graph where the
> TinkerPop
>  vm is not running in the same process space as the actual graph/data
>  that latency is a big issue. BFS alleviates the latency issue
>  significantly.
> 
>  Cheers
>  Pieter
> 
> 
> 
>  On 19/04/2018 14:49, Keith Lohnes wrote:
> >> whether this will affect more than just
> > repeat()?
> >
> > For the PR that I start and the intent of this thread was to only
> >> affect
> > repeat.
> >
> >> I prefer that the semantics of the traversal be specified in the
>  traversal
> > as a first class citizen.
> >
> > +1
> >
> >> I am fine with any default but am wondering whether it would be
> > worthwhile for the default to be overridden at a global level at not
> >> just
> > per traversal
> >
> > It might be nice, but I guess I'm not sure what `global` means in
> this
> > context. Configured on the graph object?
> >
> > On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz 
> >> wrote:
> >> TinkerPop makes no guarantees about the order of elements unless you
> >> specify an explicit order. This also goes back to the fact that
> >> certain
> >> strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy)
> >> add
> >> NoOpBarrierSteps to your traversal, which ultimately turns it into a
> >> DFS/BFS mix. Check the .explain() output of your traversal to see
> >> which
> >> strategy adds which steps.
> >>
> >> Cheers,
> >> Daniel
> >>
> >>
> >> On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
> >> mich...@michaelpollmeier.com> wrote:
> >>
>  Also it seems to me that DFS only really applies to repeat() with
> an
>  emit().
>  g.V().hasLabel("A").repeat().times(2) gets rewritten as
>  g.V().hasLabel("A").out().out(). Are their subtleties that I am
> not
>  aware of or does DFV vs BFS not matter in this case?
> >>> When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> >>> that's also what the documentation says, e.g. in this nice
> >> infographic
> >>> http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
> >>>
> >>> However, looks like that's not the case. Has my life been a lie?
>  Setting
> >>> up a simple flat graph to make things more obvious:
> >>> v3 <- v1 <- v0 -> v2 -> v4
> 

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread pieter gmail
Nah, that looks to me like the RepeatStep survived. Just the nested 
VertexStep that got replaced with JanusgraphVertexStep.

Good for them, first prize is not replacing anything.

Cheers
Pieter

On 24/04/2018 19:50, Keith Lohnes wrote:

It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail 
wrote:


Hi,

Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
with g.V().repeat(out()).times(x) only x round trips to the db is needed
regardless of the size of the graph. Each time it will go to the db with
the full set of the previous step's incoming starts.

But yeah TinkerPop's implementation is always the starting point so I'll
definitely have a look at how you have implemented DFS.

BTW, does Janus graph use TinkerPop's default RepeatStep as is with no
optimization strategies?

Cheers
Pieter

On 24/04/2018 16:33, Keith Lohnes wrote:

Pieter,

If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS

is

implemented as a modification to BFS. It's taking the starts that come in
from a BFS and stashing them to be processed later. I haven't seen a big
performance difference on JanusGraph; At least for the queries that I've
been running with it. I'm not terribly familiar with Sqlg, but I wonder

if

in the context of how DFS is implemented there, it may be less of a
concern.

Cheers,
Keith

On Thu, Apr 19, 2018 at 12:46 PM pieter gmail 
wrote:


Hi,

Not really sure either what 'global' means technically with respect to
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can
be overridden per traversal.

I get that DFS is the preferred default but for Sqlg the performance
impact is so great that I'd rather, if possible have BFS as its default.

I am not sure about this but I reckon that any graph where the TinkerPop
vm is not running in the same process space as the actual graph/data
that latency is a big issue. BFS alleviates the latency issue
significantly.

Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just

repeat()?

For the PR that I start and the intent of this thread was to only

affect

repeat.


I prefer that the semantics of the traversal be specified in the

traversal

as a first class citizen.

+1


I am fine with any default but am wondering whether it would be

worthwhile for the default to be overridden at a global level at not

just

per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz 

wrote:

TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that

certain

strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy)

add

NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see

which

strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are their subtleties that I am not
aware of or does DFV vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice

infographic

http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie?

Setting

up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than

just

repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example LazyBarrierStrategy may be replaced/removed by an

implementation

for whatever internal reason they have.

Regarding the default, I am fine with any default but am wondering
whether it 

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread Keith Lohnes
It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail 
wrote:

> Hi,
>
> Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
> with g.V().repeat(out()).times(x) only x round trips to the db is needed
> regardless of the size of the graph. Each time it will go to the db with
> the full set of the previous step's incoming starts.
>
> But yeah TinkerPop's implementation is always the starting point so I'll
> definitely have a look at how you have implemented DFS.
>
> BTW, does Janus graph use TinkerPop's default RepeatStep as is with no
> optimization strategies?
>
> Cheers
> Pieter
>
> On 24/04/2018 16:33, Keith Lohnes wrote:
> > Pieter,
> >
> > If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS
> is
> > implemented as a modification to BFS. It's taking the starts that come in
> > from a BFS and stashing them to be processed later. I haven't seen a big
> > performance difference on JanusGraph; At least for the queries that I've
> > been running with it. I'm not terribly familiar with Sqlg, but I wonder
> if
> > in the context of how DFS is implemented there, it may be less of a
> > concern.
> >
> > Cheers,
> > Keith
> >
> > On Thu, Apr 19, 2018 at 12:46 PM pieter gmail 
> > wrote:
> >
> >> Hi,
> >>
> >> Not really sure either what 'global' means technically with respect to
> >> TinkerPop's current configuration support.
> >> Perhaps it can be the start of a global configuration registry that can
> >> be overridden per traversal.
> >>
> >> I get that DFS is the preferred default but for Sqlg the performance
> >> impact is so great that I'd rather, if possible have BFS as its default.
> >>
> >> I am not sure about this but I reckon that any graph where the TinkerPop
> >> vm is not running in the same process space as the actual graph/data
> >> that latency is a big issue. BFS alleviates the latency issue
> >> significantly.
> >>
> >> Cheers
> >> Pieter
> >>
> >>
> >>
> >> On 19/04/2018 14:49, Keith Lohnes wrote:
>  whether this will affect more than just
> >>> repeat()?
> >>>
> >>> For the PR that I start and the intent of this thread was to only
> affect
> >>> repeat.
> >>>
>  I prefer that the semantics of the traversal be specified in the
> >> traversal
> >>> as a first class citizen.
> >>>
> >>> +1
> >>>
> I am fine with any default but am wondering whether it would be
> >>> worthwhile for the default to be overridden at a global level at not
> just
> >>> per traversal
> >>>
> >>> It might be nice, but I guess I'm not sure what `global` means in this
> >>> context. Configured on the graph object?
> >>>
> >>> On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz 
> wrote:
> >>>
>  TinkerPop makes no guarantees about the order of elements unless you
>  specify an explicit order. This also goes back to the fact that
> certain
>  strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy)
> add
>  NoOpBarrierSteps to your traversal, which ultimately turns it into a
>  DFS/BFS mix. Check the .explain() output of your traversal to see
> which
>  strategy adds which steps.
> 
>  Cheers,
>  Daniel
> 
> 
>  On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
>  mich...@michaelpollmeier.com> wrote:
> 
> >> Also it seems to me that DFS only really applies to repeat() with an
> >> emit().
> >> g.V().hasLabel("A").repeat().times(2) gets rewritten as
> >> g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> >> aware of or does DFV vs BFS not matter in this case?
> > When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> > that's also what the documentation says, e.g. in this nice
> infographic
> > http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
> >
> > However, looks like that's not the case. Has my life been a lie?
> >> Setting
> > up a simple flat graph to make things more obvious:
> > v3 <- v1 <- v0 -> v2 -> v4
> >
> > ```
> > graph = TinkerGraph.open()
> > v0 = graph.addVertex("l0")
> > v1 = graph.addVertex("l1")
> > v2 = graph.addVertex("l1")
> > v3 = graph.addVertex("l2")
> > v4 = graph.addVertex("l2")
> > v0.addEdge("e", v2)
> > v2.addEdge("e", v4)
> > v0.addEdge("e", v1)
> > v1.addEdge("e", v3)
> > g = graph.traversal()
> > g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
> > ```
> >
> > Prints:
> > v[2]
> > v[1]
> > v[4]
> > ==>v[4]
> > v[3]
> > ==>v[3]
> >
> > If this was OLTP the output would be:
> > v[2]
> > v[4]
> > ==>v[4]
> > v[1]
> > v[3]
> > ==>v[3]
> >
> > Cheers
> > Michael
> 

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread pieter gmail

Hi,

Sqlg completely replaces TinkerPop's RepeatStep. The idea being that 
with g.V().repeat(out()).times(x) only x round trips to the db is needed 
regardless of the size of the graph. Each time it will go to the db with 
the full set of the previous step's incoming starts.


But yeah TinkerPop's implementation is always the starting point so I'll 
definitely have a look at how you have implemented DFS.


BTW, does Janus graph use TinkerPop's default RepeatStep as is with no 
optimization strategies?


Cheers
Pieter

On 24/04/2018 16:33, Keith Lohnes wrote:

Pieter,

If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS is
implemented as a modification to BFS. It's taking the starts that come in
from a BFS and stashing them to be processed later. I haven't seen a big
performance difference on JanusGraph; At least for the queries that I've
been running with it. I'm not terribly familiar with Sqlg, but I wonder if
in the context of how DFS is implemented there, it may be less of a
concern.

Cheers,
Keith

On Thu, Apr 19, 2018 at 12:46 PM pieter gmail 
wrote:


Hi,

Not really sure either what 'global' means technically with respect to
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can
be overridden per traversal.

I get that DFS is the preferred default but for Sqlg the performance
impact is so great that I'd rather, if possible have BFS as its default.

I am not sure about this but I reckon that any graph where the TinkerPop
vm is not running in the same process space as the actual graph/data
that latency is a big issue. BFS alleviates the latency issue
significantly.

Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just

repeat()?

For the PR that I start and the intent of this thread was to only affect
repeat.


I prefer that the semantics of the traversal be specified in the

traversal

as a first class citizen.

+1


   I am fine with any default but am wondering whether it would be

worthwhile for the default to be overridden at a global level at not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz  wrote:


TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are their subtleties that I am not
aware of or does DFV vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie?

Setting

up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than

just

repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example LazyBarrierStrategy may be replaced/removed by an

implementation

for whatever internal reason they have.

Regarding the default, I am fine with any default but am wondering
whether it would be worthwhile for the default to be overridden at a
global level at not just per traversal? That way the impact can also

be

alleviated when folks upgrade.

Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are their subtleties that I am not
aware of or does DFV vs BFS not matter in this case?

Cheers
Pieter

On 17/04/2018 14:58, Robert Dale wrote:

+1 on DFS
-1 on order(SearchAlgo)

It seems like a strategy may 

Re: [DISCUSS] Depth First Repeat step

2018-04-24 Thread Keith Lohnes
Pieter,

If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS is
implemented as a modification to BFS. It's taking the starts that come in
from a BFS and stashing them to be processed later. I haven't seen a big
performance difference on JanusGraph; At least for the queries that I've
been running with it. I'm not terribly familiar with Sqlg, but I wonder if
in the context of how DFS is implemented there, it may be less of a
concern.

Cheers,
Keith

On Thu, Apr 19, 2018 at 12:46 PM pieter gmail 
wrote:

> Hi,
>
> Not really sure either what 'global' means technically with respect to
> TinkerPop's current configuration support.
> Perhaps it can be the start of a global configuration registry that can
> be overridden per traversal.
>
> I get that DFS is the preferred default but for Sqlg the performance
> impact is so great that I'd rather, if possible have BFS as its default.
>
> I am not sure about this but I reckon that any graph where the TinkerPop
> vm is not running in the same process space as the actual graph/data
> that latency is a big issue. BFS alleviates the latency issue
> significantly.
>
> Cheers
> Pieter
>
>
>
> On 19/04/2018 14:49, Keith Lohnes wrote:
> >> whether this will affect more than just
> > repeat()?
> >
> > For the PR that I start and the intent of this thread was to only affect
> > repeat.
> >
> >> I prefer that the semantics of the traversal be specified in the
> traversal
> > as a first class citizen.
> >
> > +1
> >
> >>   I am fine with any default but am wondering whether it would be
> > worthwhile for the default to be overridden at a global level at not just
> > per traversal
> >
> > It might be nice, but I guess I'm not sure what `global` means in this
> > context. Configured on the graph object?
> >
> > On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz  wrote:
> >
> >> TinkerPop makes no guarantees about the order of elements unless you
> >> specify an explicit order. This also goes back to the fact that certain
> >> strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
> >> NoOpBarrierSteps to your traversal, which ultimately turns it into a
> >> DFS/BFS mix. Check the .explain() output of your traversal to see which
> >> strategy adds which steps.
> >>
> >> Cheers,
> >> Daniel
> >>
> >>
> >> On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
> >> mich...@michaelpollmeier.com> wrote:
> >>
>  Also it seems to me that DFS only really applies to repeat() with an
>  emit().
>  g.V().hasLabel("A").repeat().times(2) gets rewritten as
>  g.V().hasLabel("A").out().out(). Are their subtleties that I am not
>  aware of or does DFV vs BFS not matter in this case?
> >>> When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> >>> that's also what the documentation says, e.g. in this nice infographic
> >>> http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
> >>>
> >>> However, looks like that's not the case. Has my life been a lie?
> Setting
> >>> up a simple flat graph to make things more obvious:
> >>> v3 <- v1 <- v0 -> v2 -> v4
> >>>
> >>> ```
> >>> graph = TinkerGraph.open()
> >>> v0 = graph.addVertex("l0")
> >>> v1 = graph.addVertex("l1")
> >>> v2 = graph.addVertex("l1")
> >>> v3 = graph.addVertex("l2")
> >>> v4 = graph.addVertex("l2")
> >>> v0.addEdge("e", v2)
> >>> v2.addEdge("e", v4)
> >>> v0.addEdge("e", v1)
> >>> v1.addEdge("e", v3)
> >>> g = graph.traversal()
> >>> g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
> >>> ```
> >>>
> >>> Prints:
> >>> v[2]
> >>> v[1]
> >>> v[4]
> >>> ==>v[4]
> >>> v[3]
> >>> ==>v[3]
> >>>
> >>> If this was OLTP the output would be:
> >>> v[2]
> >>> v[4]
> >>> ==>v[4]
> >>> v[1]
> >>> v[3]
> >>> ==>v[3]
> >>>
> >>> Cheers
> >>> Michael
> >>>
> >>> On 18/04/18 02:58, pieter gmail wrote:
>  Hi,
> 
>  I agree with the question about whether this will affect more than
> just
>  repeat()?
> 
>  I prefer that the semantics of the traversal be specified in the
>  traversal as a first class citizen. i.e. with order(SearchAlgo).
>  Strategies are to my mind internal to an implementation. In Robert's
>  example LazyBarrierStrategy may be replaced/removed by an
> >> implementation
>  for whatever internal reason they have.
> 
>  Regarding the default, I am fine with any default but am wondering
>  whether it would be worthwhile for the default to be overridden at a
>  global level at not just per traversal? That way the impact can also
> be
>  alleviated when folks upgrade.
> 
>  Also it seems to me that DFS only really applies to repeat() with an
>  emit().
>  g.V().hasLabel("A").repeat().times(2) gets rewritten as
>  g.V().hasLabel("A").out().out(). Are their subtleties that I am not
>  aware of or does DFV vs BFS not matter in this case?
> 
>  Cheers
>  Pieter
> 
>  On 17/04/2018 14:58, Robert Dale wrote:
> 

Re: [DISCUSS] Depth First Repeat step

2018-04-19 Thread pieter gmail

Hi,

Not really sure either what 'global' means technically with respect to 
TinkerPop's current configuration support.
Perhaps it can be the start of a global configuration registry that can 
be overridden per traversal.


I get that DFS is the preferred default but for Sqlg the performance 
impact is so great that I'd rather, if possible have BFS as its default.


I am not sure about this but I reckon that any graph where the TinkerPop 
vm is not running in the same process space as the actual graph/data 
that latency is a big issue. BFS alleviates the latency issue significantly.


Cheers
Pieter



On 19/04/2018 14:49, Keith Lohnes wrote:

whether this will affect more than just

repeat()?

For the PR that I start and the intent of this thread was to only affect
repeat.


I prefer that the semantics of the traversal be specified in the traversal

as a first class citizen.

+1


  I am fine with any default but am wondering whether it would be

worthwhile for the default to be overridden at a global level at not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz  wrote:


TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:


Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are their subtleties that I am not
aware of or does DFV vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie? Setting
up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:

Hi,

I agree with the question about whether this will affect more than just
repeat()?

I prefer that the semantics of the traversal be specified in the
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's
example LazyBarrierStrategy may be replaced/removed by an

implementation

for whatever internal reason they have.

Regarding the default, I am fine with any default but am wondering
whether it would be worthwhile for the default to be overridden at a
global level at not just per traversal? That way the impact can also be
alleviated when folks upgrade.

Also it seems to me that DFS only really applies to repeat() with an
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as
g.V().hasLabel("A").out().out(). Are their subtleties that I am not
aware of or does DFV vs BFS not matter in this case?

Cheers
Pieter

On 17/04/2018 14:58, Robert Dale wrote:

+1 on DFS
-1 on order(SearchAlgo)

It seems like a strategy may be more appropriate.  It could affect

more

than just repeat().  And how would this interact with
LazyBarrierStrategy?

Maybe the default should be DFS with LazyBarrierStrategy. Then
LazyBarrierStrategy
can be removed with 'withoutStrategies()' and then it works just like
everything else.  I'd prefer consistency with the way everything else
works.



Robert Dale

On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette <

spmalle...@gmail.com

wrote:


Thanks for the summary. Just a quick note - I'd not worry about the

GLV

tests for now. That part should be easy to sort out. Let's first make
sure
that we get clear on the other items first before digging too deeply
there.

On an administrative front, I think that this change should just go

to

3.4.0/master (so it's currently targeted to the right branch
actually) as
it sounds like we want DFS to be the default and that could be a
breaking
change as the semantics of the traversal shift a bit. So that's one
issue
to make sure we're all happy with.

A next issue would be the API from the user perspective - thanks for

the

link to that JIRA btw - I see I was 

Re: [DISCUSS] Depth First Repeat step

2018-04-19 Thread Keith Lohnes
> whether this will affect more than just
repeat()?

For the PR that I start and the intent of this thread was to only affect
repeat.

> I prefer that the semantics of the traversal be specified in the traversal
as a first class citizen.

+1

>  I am fine with any default but am wondering whether it would be
worthwhile for the default to be overridden at a global level at not just
per traversal

It might be nice, but I guess I'm not sure what `global` means in this
context. Configured on the graph object?

On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz  wrote:

> TinkerPop makes no guarantees about the order of elements unless you
> specify an explicit order. This also goes back to the fact that certain
> strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
> NoOpBarrierSteps to your traversal, which ultimately turns it into a
> DFS/BFS mix. Check the .explain() output of your traversal to see which
> strategy adds which steps.
>
> Cheers,
> Daniel
>
>
> On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
> mich...@michaelpollmeier.com> wrote:
>
> > > Also it seems to me that DFS only really applies to repeat() with an
> > > emit().
> > > g.V().hasLabel("A").repeat().times(2) gets rewritten as
> > > g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> > > aware of or does DFV vs BFS not matter in this case?
> >
> > When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> > that's also what the documentation says, e.g. in this nice infographic
> > http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
> >
> > However, looks like that's not the case. Has my life been a lie? Setting
> > up a simple flat graph to make things more obvious:
> > v3 <- v1 <- v0 -> v2 -> v4
> >
> > ```
> > graph = TinkerGraph.open()
> > v0 = graph.addVertex("l0")
> > v1 = graph.addVertex("l1")
> > v2 = graph.addVertex("l1")
> > v3 = graph.addVertex("l2")
> > v4 = graph.addVertex("l2")
> > v0.addEdge("e", v2)
> > v2.addEdge("e", v4)
> > v0.addEdge("e", v1)
> > v1.addEdge("e", v3)
> > g = graph.traversal()
> > g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
> > ```
> >
> > Prints:
> > v[2]
> > v[1]
> > v[4]
> > ==>v[4]
> > v[3]
> > ==>v[3]
> >
> > If this was OLTP the output would be:
> > v[2]
> > v[4]
> > ==>v[4]
> > v[1]
> > v[3]
> > ==>v[3]
> >
> > Cheers
> > Michael
> >
> > On 18/04/18 02:58, pieter gmail wrote:
> > > Hi,
> > >
> > > I agree with the question about whether this will affect more than just
> > > repeat()?
> > >
> > > I prefer that the semantics of the traversal be specified in the
> > > traversal as a first class citizen. i.e. with order(SearchAlgo).
> > > Strategies are to my mind internal to an implementation. In Robert's
> > > example LazyBarrierStrategy may be replaced/removed by an
> implementation
> > > for whatever internal reason they have.
> > >
> > > Regarding the default, I am fine with any default but am wondering
> > > whether it would be worthwhile for the default to be overridden at a
> > > global level at not just per traversal? That way the impact can also be
> > > alleviated when folks upgrade.
> > >
> > > Also it seems to me that DFS only really applies to repeat() with an
> > > emit().
> > > g.V().hasLabel("A").repeat().times(2) gets rewritten as
> > > g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> > > aware of or does DFV vs BFS not matter in this case?
> > >
> > > Cheers
> > > Pieter
> > >
> > > On 17/04/2018 14:58, Robert Dale wrote:
> > >> +1 on DFS
> > >> -1 on order(SearchAlgo)
> > >>
> > >> It seems like a strategy may be more appropriate.  It could affect
> more
> > >> than just repeat().  And how would this interact with
> > >> LazyBarrierStrategy?
> > >>
> > >> Maybe the default should be DFS with LazyBarrierStrategy. Then
> > >> LazyBarrierStrategy
> > >> can be removed with 'withoutStrategies()' and then it works just like
> > >> everything else.  I'd prefer consistency with the way everything else
> > >> works.
> > >>
> > >>
> > >>
> > >> Robert Dale
> > >>
> > >> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette <
> spmalle...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Thanks for the summary. Just a quick note - I'd not worry about the
> GLV
> > >>> tests for now. That part should be easy to sort out. Let's first make
> > >>> sure
> > >>> that we get clear on the other items first before digging too deeply
> > >>> there.
> > >>>
> > >>> On an administrative front, I think that this change should just go
> to
> > >>> 3.4.0/master (so it's currently targeted to the right branch
> > >>> actually) as
> > >>> it sounds like we want DFS to be the default and that could be a
> > >>> breaking
> > >>> change as the semantics of the traversal shift a bit. So that's one
> > >>> issue
> > >>> to make sure we're all happy with.
> > >>>
> > >>> A next issue would be the API from the user perspective - thanks for
> > the
> > >>> link to that JIRA btw - I see I was involved 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread Daniel Kuppitz
TinkerPop makes no guarantees about the order of elements unless you
specify an explicit order. This also goes back to the fact that certain
strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add
NoOpBarrierSteps to your traversal, which ultimately turns it into a
DFS/BFS mix. Check the .explain() output of your traversal to see which
strategy adds which steps.

Cheers,
Daniel


On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:

> > Also it seems to me that DFS only really applies to repeat() with an
> > emit().
> > g.V().hasLabel("A").repeat().times(2) gets rewritten as
> > g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> > aware of or does DFV vs BFS not matter in this case?
>
> When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> that's also what the documentation says, e.g. in this nice infographic
> http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
>
> However, looks like that's not the case. Has my life been a lie? Setting
> up a simple flat graph to make things more obvious:
> v3 <- v1 <- v0 -> v2 -> v4
>
> ```
> graph = TinkerGraph.open()
> v0 = graph.addVertex("l0")
> v1 = graph.addVertex("l1")
> v2 = graph.addVertex("l1")
> v3 = graph.addVertex("l2")
> v4 = graph.addVertex("l2")
> v0.addEdge("e", v2)
> v2.addEdge("e", v4)
> v0.addEdge("e", v1)
> v1.addEdge("e", v3)
> g = graph.traversal()
> g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
> ```
>
> Prints:
> v[2]
> v[1]
> v[4]
> ==>v[4]
> v[3]
> ==>v[3]
>
> If this was OLTP the output would be:
> v[2]
> v[4]
> ==>v[4]
> v[1]
> v[3]
> ==>v[3]
>
> Cheers
> Michael
>
> On 18/04/18 02:58, pieter gmail wrote:
> > Hi,
> >
> > I agree with the question about whether this will affect more than just
> > repeat()?
> >
> > I prefer that the semantics of the traversal be specified in the
> > traversal as a first class citizen. i.e. with order(SearchAlgo).
> > Strategies are to my mind internal to an implementation. In Robert's
> > example LazyBarrierStrategy may be replaced/removed by an implementation
> > for whatever internal reason they have.
> >
> > Regarding the default, I am fine with any default but am wondering
> > whether it would be worthwhile for the default to be overridden at a
> > global level at not just per traversal? That way the impact can also be
> > alleviated when folks upgrade.
> >
> > Also it seems to me that DFS only really applies to repeat() with an
> > emit().
> > g.V().hasLabel("A").repeat().times(2) gets rewritten as
> > g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> > aware of or does DFV vs BFS not matter in this case?
> >
> > Cheers
> > Pieter
> >
> > On 17/04/2018 14:58, Robert Dale wrote:
> >> +1 on DFS
> >> -1 on order(SearchAlgo)
> >>
> >> It seems like a strategy may be more appropriate.  It could affect more
> >> than just repeat().  And how would this interact with
> >> LazyBarrierStrategy?
> >>
> >> Maybe the default should be DFS with LazyBarrierStrategy. Then
> >> LazyBarrierStrategy
> >> can be removed with 'withoutStrategies()' and then it works just like
> >> everything else.  I'd prefer consistency with the way everything else
> >> works.
> >>
> >>
> >>
> >> Robert Dale
> >>
> >> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette  >
> >> wrote:
> >>
> >>> Thanks for the summary. Just a quick note - I'd not worry about the GLV
> >>> tests for now. That part should be easy to sort out. Let's first make
> >>> sure
> >>> that we get clear on the other items first before digging too deeply
> >>> there.
> >>>
> >>> On an administrative front, I think that this change should just go to
> >>> 3.4.0/master (so it's currently targeted to the right branch
> >>> actually) as
> >>> it sounds like we want DFS to be the default and that could be a
> >>> breaking
> >>> change as the semantics of the traversal shift a bit. So that's one
> >>> issue
> >>> to make sure we're all happy with.
> >>>
> >>> A next issue would be the API from the user perspective - thanks for
> the
> >>> link to that JIRA btw - I see I was involved in that conversation a
> >>> little
> >>> bit but then dropped off. You'd commented that you preferred a per loop
> >>> configuration for search approach which is what you stuck with for the
> >>> implementation. I guess I can see that there could be some value in
> >>> having
> >>> that ability. That said, I'm not sure that I like the overload of
> >>> order()
> >>> as the method for doing that:
> >>>
> >>> g.V().has("name", "marko").
> >>> repeat(__.outE().order().by("weight", decr).inV()).
> >>>   emit().
> >>>   order(SearchAlgo.DFS)
> >>>
> >>> Still thinking about what else we might do there, but perhaps that
> >>> can wait
> >>> a bit as I still need to study your changes in detail. Regarding:
> >>>
>    The barrier step that Daniel described doesn’t currently work since
> >>> there’s basically booleans in the 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread Michael Pollmeier
> Also it seems to me that DFS only really applies to repeat() with an
> emit().
> g.V().hasLabel("A").repeat().times(2) gets rewritten as
> g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> aware of or does DFV vs BFS not matter in this case?

When I read this I thought: clearly `.out().out()` is DFS for OLTP,
that's also what the documentation says, e.g. in this nice infographic
http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png

However, looks like that's not the case. Has my life been a lie? Setting
up a simple flat graph to make things more obvious:
v3 <- v1 <- v0 -> v2 -> v4

```
graph = TinkerGraph.open()
v0 = graph.addVertex("l0")
v1 = graph.addVertex("l1")
v2 = graph.addVertex("l1")
v3 = graph.addVertex("l2")
v4 = graph.addVertex("l2")
v0.addEdge("e", v2)
v2.addEdge("e", v4)
v0.addEdge("e", v1)
v1.addEdge("e", v3)
g = graph.traversal()
g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
```

Prints:
v[2]
v[1]
v[4]
==>v[4]
v[3]
==>v[3]

If this was OLTP the output would be:
v[2]
v[4]
==>v[4]
v[1]
v[3]
==>v[3]

Cheers
Michael

On 18/04/18 02:58, pieter gmail wrote:
> Hi,
> 
> I agree with the question about whether this will affect more than just
> repeat()?
> 
> I prefer that the semantics of the traversal be specified in the
> traversal as a first class citizen. i.e. with order(SearchAlgo).
> Strategies are to my mind internal to an implementation. In Robert's
> example LazyBarrierStrategy may be replaced/removed by an implementation
> for whatever internal reason they have.
> 
> Regarding the default, I am fine with any default but am wondering
> whether it would be worthwhile for the default to be overridden at a
> global level at not just per traversal? That way the impact can also be
> alleviated when folks upgrade.
> 
> Also it seems to me that DFS only really applies to repeat() with an
> emit().
> g.V().hasLabel("A").repeat().times(2) gets rewritten as
> g.V().hasLabel("A").out().out(). Are their subtleties that I am not
> aware of or does DFV vs BFS not matter in this case?
> 
> Cheers
> Pieter
> 
> On 17/04/2018 14:58, Robert Dale wrote:
>> +1 on DFS
>> -1 on order(SearchAlgo)
>>
>> It seems like a strategy may be more appropriate.  It could affect more
>> than just repeat().  And how would this interact with
>> LazyBarrierStrategy?
>>
>> Maybe the default should be DFS with LazyBarrierStrategy. Then
>> LazyBarrierStrategy
>> can be removed with 'withoutStrategies()' and then it works just like
>> everything else.  I'd prefer consistency with the way everything else
>> works.
>>
>>
>>
>> Robert Dale
>>
>> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette 
>> wrote:
>>
>>> Thanks for the summary. Just a quick note - I'd not worry about the GLV
>>> tests for now. That part should be easy to sort out. Let's first make
>>> sure
>>> that we get clear on the other items first before digging too deeply
>>> there.
>>>
>>> On an administrative front, I think that this change should just go to
>>> 3.4.0/master (so it's currently targeted to the right branch
>>> actually) as
>>> it sounds like we want DFS to be the default and that could be a
>>> breaking
>>> change as the semantics of the traversal shift a bit. So that's one
>>> issue
>>> to make sure we're all happy with.
>>>
>>> A next issue would be the API from the user perspective - thanks for the
>>> link to that JIRA btw - I see I was involved in that conversation a
>>> little
>>> bit but then dropped off. You'd commented that you preferred a per loop
>>> configuration for search approach which is what you stuck with for the
>>> implementation. I guess I can see that there could be some value in
>>> having
>>> that ability. That said, I'm not sure that I like the overload of
>>> order()
>>> as the method for doing that:
>>>
>>> g.V().has("name", "marko").
>>>     repeat(__.outE().order().by("weight", decr).inV()).
>>>   emit().
>>>   order(SearchAlgo.DFS)
>>>
>>> Still thinking about what else we might do there, but perhaps that
>>> can wait
>>> a bit as I still need to study your changes in detail. Regarding:
>>>
   The barrier step that Daniel described doesn’t currently work since
>>> there’s basically booleans in the RepeatStep on whether or not to
>>> stash the
>>> starts to make the RepeatStep depth first.
>>>
>>> I assume you are referring to these booleans:
>>>
>>> https://github.com/apache/tinkerpop/blob/fe8ee98f6967c7b0f0ee7f2cf2d105
>>> 31b68fab8b/gremlin-core/src/main/java/org/apache/
>>> tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java#L46-L47
>>>
>>> ?
>>>
>>>
>>>
>>> On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes  wrote:
>>>
 Stephen, That’s a fair summary. I had an immediate need for it, so I
 developed something based on Michel Pollmeier’s work and a modification
>>> to
 the syntax Pieter Martin suggested in the Jira
 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread pieter gmail

Hi,

I agree with the question about whether this will affect more than just 
repeat()?


I prefer that the semantics of the traversal be specified in the 
traversal as a first class citizen. i.e. with order(SearchAlgo).
Strategies are to my mind internal to an implementation. In Robert's 
example LazyBarrierStrategy may be replaced/removed by an implementation 
for whatever internal reason they have.


Regarding the default, I am fine with any default but am wondering 
whether it would be worthwhile for the default to be overridden at a 
global level at not just per traversal? That way the impact can also be 
alleviated when folks upgrade.


Also it seems to me that DFS only really applies to repeat() with an 
emit().
g.V().hasLabel("A").repeat().times(2) gets rewritten as 
g.V().hasLabel("A").out().out(). Are their subtleties that I am not 
aware of or does DFV vs BFS not matter in this case?


Cheers
Pieter

On 17/04/2018 14:58, Robert Dale wrote:

+1 on DFS
-1 on order(SearchAlgo)

It seems like a strategy may be more appropriate.  It could affect more
than just repeat().  And how would this interact with LazyBarrierStrategy?

Maybe the default should be DFS with LazyBarrierStrategy. Then
LazyBarrierStrategy
can be removed with 'withoutStrategies()' and then it works just like
everything else.  I'd prefer consistency with the way everything else works.



Robert Dale

On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette 
wrote:


Thanks for the summary. Just a quick note - I'd not worry about the GLV
tests for now. That part should be easy to sort out. Let's first make sure
that we get clear on the other items first before digging too deeply there.

On an administrative front, I think that this change should just go to
3.4.0/master (so it's currently targeted to the right branch actually) as
it sounds like we want DFS to be the default and that could be a breaking
change as the semantics of the traversal shift a bit. So that's one issue
to make sure we're all happy with.

A next issue would be the API from the user perspective - thanks for the
link to that JIRA btw - I see I was involved in that conversation a little
bit but then dropped off. You'd commented that you preferred a per loop
configuration for search approach which is what you stuck with for the
implementation. I guess I can see that there could be some value in having
that ability. That said, I'm not sure that I like the overload of order()
as the method for doing that:

g.V().has("name", "marko").
repeat(__.outE().order().by("weight", decr).inV()).
  emit().
  order(SearchAlgo.DFS)

Still thinking about what else we might do there, but perhaps that can wait
a bit as I still need to study your changes in detail. Regarding:


  The barrier step that Daniel described doesn’t currently work since

there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I assume you are referring to these booleans:

https://github.com/apache/tinkerpop/blob/fe8ee98f6967c7b0f0ee7f2cf2d105
31b68fab8b/gremlin-core/src/main/java/org/apache/
tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java#L46-L47

?



On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes  wrote:


Stephen, That’s a fair summary. I had an immediate need for it, so I
developed something based on Michel Pollmeier’s work and a modification

to

the syntax Pieter Martin suggested in the Jira

with DFS being explicit.

The semantics I used were g.V().repeat(out()).order(DFS), putting the
order
clause outside of the repeat. It made it simpler for me to develop and
seemed nice to make that DFS vs BFS explicit. The barrier step that

Daniel

described doesn’t currently work since there’s basically booleans in the
RepeatStep on whether or not to stash the starts to make the RepeatStep
depth first.

I added a test for the DFS, though I’ve had some issues getting things to
run re: .net tests and some other tests timing out. I have some more

tests

I'd like to write based off issues that I ran into testing this in the
console (k-ary trees, until/emit before the repeat vs after), but I

really

wanted to get this out there for people to take a look at and see if it
could work out.
​

On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette 
wrote:


  Keith, I have to admit that I didn't really follow this general body

of

work closely, which include this pull request, the earlier one from

Michael

Pollmeier, the JIRA and I think some discussion somewhere on one of the
mailing lists. As best I have gathered from what I did follow, I feel

like

the focus of this work so far was mostly related to "can someone get it

to

actually work", but not a lot about other aspects like testing, api,
release branch to apply 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread Robert Dale
+1 on DFS
-1 on order(SearchAlgo)

It seems like a strategy may be more appropriate.  It could affect more
than just repeat().  And how would this interact with LazyBarrierStrategy?

Maybe the default should be DFS with LazyBarrierStrategy. Then
LazyBarrierStrategy
can be removed with 'withoutStrategies()' and then it works just like
everything else.  I'd prefer consistency with the way everything else works.



Robert Dale

On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette 
wrote:

> Thanks for the summary. Just a quick note - I'd not worry about the GLV
> tests for now. That part should be easy to sort out. Let's first make sure
> that we get clear on the other items first before digging too deeply there.
>
> On an administrative front, I think that this change should just go to
> 3.4.0/master (so it's currently targeted to the right branch actually) as
> it sounds like we want DFS to be the default and that could be a breaking
> change as the semantics of the traversal shift a bit. So that's one issue
> to make sure we're all happy with.
>
> A next issue would be the API from the user perspective - thanks for the
> link to that JIRA btw - I see I was involved in that conversation a little
> bit but then dropped off. You'd commented that you preferred a per loop
> configuration for search approach which is what you stuck with for the
> implementation. I guess I can see that there could be some value in having
> that ability. That said, I'm not sure that I like the overload of order()
> as the method for doing that:
>
> g.V().has("name", "marko").
>repeat(__.outE().order().by("weight", decr).inV()).
>  emit().
>  order(SearchAlgo.DFS)
>
> Still thinking about what else we might do there, but perhaps that can wait
> a bit as I still need to study your changes in detail. Regarding:
>
> >  The barrier step that Daniel described doesn’t currently work since
> there’s basically booleans in the RepeatStep on whether or not to stash the
> starts to make the RepeatStep depth first.
>
> I assume you are referring to these booleans:
>
> https://github.com/apache/tinkerpop/blob/fe8ee98f6967c7b0f0ee7f2cf2d105
> 31b68fab8b/gremlin-core/src/main/java/org/apache/
> tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java#L46-L47
>
> ?
>
>
>
> On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes  wrote:
>
> > Stephen, That’s a fair summary. I had an immediate need for it, so I
> > developed something based on Michel Pollmeier’s work and a modification
> to
> > the syntax Pieter Martin suggested in the Jira
> >  > focusedCommentId=16233681=com.atlassian.jira.
> > plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16233681>
> > with DFS being explicit.
> >
> > The semantics I used were g.V().repeat(out()).order(DFS), putting the
> > order
> > clause outside of the repeat. It made it simpler for me to develop and
> > seemed nice to make that DFS vs BFS explicit. The barrier step that
> Daniel
> > described doesn’t currently work since there’s basically booleans in the
> > RepeatStep on whether or not to stash the starts to make the RepeatStep
> > depth first.
> >
> > I added a test for the DFS, though I’ve had some issues getting things to
> > run re: .net tests and some other tests timing out. I have some more
> tests
> > I'd like to write based off issues that I ran into testing this in the
> > console (k-ary trees, until/emit before the repeat vs after), but I
> really
> > wanted to get this out there for people to take a look at and see if it
> > could work out.
> > ​
> >
> > On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette 
> > wrote:
> >
> > >  Keith, I have to admit that I didn't really follow this general body
> of
> > > work closely, which include this pull request, the earlier one from
> > Michael
> > > Pollmeier, the JIRA and I think some discussion somewhere on one of the
> > > mailing lists. As best I have gathered from what I did follow, I feel
> > like
> > > the focus of this work so far was mostly related to "can someone get it
> > to
> > > actually work", but not a lot about other aspects like testing, api,
> > > release branch to apply it to, etc. Is that a fair depiction of how
> this
> > > work has developed so far? if so, let's use this thread to make sure
> > we're
> > > all on the same page as to how this change will get in on all those
> sorts
> > > of issues.
> > >
> > > btw, thanks to you and michael for sticking at this to come to
> something
> > > that seems to work. looking forward to the continued discussion on this
> > > thread.
> > >
> > >
> > > On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
> > > mich...@michaelpollmeier.com> wrote:
> > >
> > > > Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling
> is
> > > > that most users currently expect it to be DFS since that's what the
> > docs
> > > > say.
> > > >
> > > > And yes, it's easy 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread Stephen Mallette
Thanks for the summary. Just a quick note - I'd not worry about the GLV
tests for now. That part should be easy to sort out. Let's first make sure
that we get clear on the other items first before digging too deeply there.

On an administrative front, I think that this change should just go to
3.4.0/master (so it's currently targeted to the right branch actually) as
it sounds like we want DFS to be the default and that could be a breaking
change as the semantics of the traversal shift a bit. So that's one issue
to make sure we're all happy with.

A next issue would be the API from the user perspective - thanks for the
link to that JIRA btw - I see I was involved in that conversation a little
bit but then dropped off. You'd commented that you preferred a per loop
configuration for search approach which is what you stuck with for the
implementation. I guess I can see that there could be some value in having
that ability. That said, I'm not sure that I like the overload of order()
as the method for doing that:

g.V().has("name", "marko").
   repeat(__.outE().order().by("weight", decr).inV()).
 emit().
 order(SearchAlgo.DFS)

Still thinking about what else we might do there, but perhaps that can wait
a bit as I still need to study your changes in detail. Regarding:

>  The barrier step that Daniel described doesn’t currently work since
there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I assume you are referring to these booleans:

https://github.com/apache/tinkerpop/blob/fe8ee98f6967c7b0f0ee7f2cf2d10531b68fab8b/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java#L46-L47

?



On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes  wrote:

> Stephen, That’s a fair summary. I had an immediate need for it, so I
> developed something based on Michel Pollmeier’s work and a modification to
> the syntax Pieter Martin suggested in the Jira
>  focusedCommentId=16233681=com.atlassian.jira.
> plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16233681>
> with DFS being explicit.
>
> The semantics I used were g.V().repeat(out()).order(DFS), putting the
> order
> clause outside of the repeat. It made it simpler for me to develop and
> seemed nice to make that DFS vs BFS explicit. The barrier step that Daniel
> described doesn’t currently work since there’s basically booleans in the
> RepeatStep on whether or not to stash the starts to make the RepeatStep
> depth first.
>
> I added a test for the DFS, though I’ve had some issues getting things to
> run re: .net tests and some other tests timing out. I have some more tests
> I'd like to write based off issues that I ran into testing this in the
> console (k-ary trees, until/emit before the repeat vs after), but I really
> wanted to get this out there for people to take a look at and see if it
> could work out.
> ​
>
> On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette 
> wrote:
>
> >  Keith, I have to admit that I didn't really follow this general body of
> > work closely, which include this pull request, the earlier one from
> Michael
> > Pollmeier, the JIRA and I think some discussion somewhere on one of the
> > mailing lists. As best I have gathered from what I did follow, I feel
> like
> > the focus of this work so far was mostly related to "can someone get it
> to
> > actually work", but not a lot about other aspects like testing, api,
> > release branch to apply it to, etc. Is that a fair depiction of how this
> > work has developed so far? if so, let's use this thread to make sure
> we're
> > all on the same page as to how this change will get in on all those sorts
> > of issues.
> >
> > btw, thanks to you and michael for sticking at this to come to something
> > that seems to work. looking forward to the continued discussion on this
> > thread.
> >
> >
> > On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
> > mich...@michaelpollmeier.com> wrote:
> >
> > > Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling is
> > > that most users currently expect it to be DFS since that's what the
> docs
> > > say.
> > >
> > > And yes, it's easy to verify the default in the test suite, once we
> > > agreed on what the default should be.
> > >
> > > Cheers
> > > Michael
> > >
> > > On 17/04/18 04:40, pieter gmail wrote:
> > > > Hi,
> > > >
> > > > I have not properly followed the previous thread. However I thought
> is
> > > > going to be a way to set the default behavior as apposed to needing
> to
> > > > use barrier()
> > > > Is this the case or not?
> > > >
> > > > For Sqlg at least it is possible to optimize BFS much more
> effectively
> > > > than DFS so it will be nice to have a way to set the strategy rather
> > > > than having to manually inject barriers.
> > > >
> > > > Is the test suite going to enforce the BFS vs DFS?
> > > >
> > > > 

Re: [DISCUSS] Depth First Repeat step

2018-04-17 Thread Keith Lohnes
Stephen, That’s a fair summary. I had an immediate need for it, so I
developed something based on Michel Pollmeier’s work and a modification to
the syntax Pieter Martin suggested in the Jira

with DFS being explicit.

The semantics I used were g.V().repeat(out()).order(DFS), putting the order
clause outside of the repeat. It made it simpler for me to develop and
seemed nice to make that DFS vs BFS explicit. The barrier step that Daniel
described doesn’t currently work since there’s basically booleans in the
RepeatStep on whether or not to stash the starts to make the RepeatStep
depth first.

I added a test for the DFS, though I’ve had some issues getting things to
run re: .net tests and some other tests timing out. I have some more tests
I'd like to write based off issues that I ran into testing this in the
console (k-ary trees, until/emit before the repeat vs after), but I really
wanted to get this out there for people to take a look at and see if it
could work out.
​

On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette 
wrote:

>  Keith, I have to admit that I didn't really follow this general body of
> work closely, which include this pull request, the earlier one from Michael
> Pollmeier, the JIRA and I think some discussion somewhere on one of the
> mailing lists. As best I have gathered from what I did follow, I feel like
> the focus of this work so far was mostly related to "can someone get it to
> actually work", but not a lot about other aspects like testing, api,
> release branch to apply it to, etc. Is that a fair depiction of how this
> work has developed so far? if so, let's use this thread to make sure we're
> all on the same page as to how this change will get in on all those sorts
> of issues.
>
> btw, thanks to you and michael for sticking at this to come to something
> that seems to work. looking forward to the continued discussion on this
> thread.
>
>
> On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
> mich...@michaelpollmeier.com> wrote:
>
> > Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling is
> > that most users currently expect it to be DFS since that's what the docs
> > say.
> >
> > And yes, it's easy to verify the default in the test suite, once we
> > agreed on what the default should be.
> >
> > Cheers
> > Michael
> >
> > On 17/04/18 04:40, pieter gmail wrote:
> > > Hi,
> > >
> > > I have not properly followed the previous thread. However I thought is
> > > going to be a way to set the default behavior as apposed to needing to
> > > use barrier()
> > > Is this the case or not?
> > >
> > > For Sqlg at least it is possible to optimize BFS much more effectively
> > > than DFS so it will be nice to have a way to set the strategy rather
> > > than having to manually inject barriers.
> > >
> > > Is the test suite going to enforce the BFS vs DFS?
> > >
> > > Thanks
> > > Pieter
> > >
> > > On 16/04/2018 16:56, Daniel Kuppitz wrote:
> > >> +1 for DFS. If the query relies on BFS, you can always do
> > >> .repeat(barrier())...
> > >>
> > >> ^ This holds true as long as there's no significant difference in the
> > >> cpu+memory consumption and overall performance of the two approaches.
> > BFS
> > >> has its advantages when it comes to bulking; an arbitrary number of
> > >> traversers on the same element is processed at the same pace as a
> single
> > >> traverser. I don't think we can benefit from bulking in DFS.
> > >>
> > >> Cheers,
> > >> Daniel
> > >>
> > >>
> > >> On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes 
> > wrote:
> > >>
> > >>> As part of #838 
> there’s
> > >>> some
> > >>> discussion around whether or not to make DFS the default for the
> repeat
> > >>> step. On the one hand, everything else in OLTP is depth first. On the
> > >>> other
> > >>> hand, there’s likely existing traversals that depend on the breadth
> > >>> first
> > >>> nature of repeat. My general preference is to make DFS optional at
> > >>> first,
> > >>> and at some later date, change the default and have that be a
> separate
> > >>> change from implementing DFS for repeat
> > >>> ​
> > >>>
> > >
> > >
> >
> >
>


Re: [DISCUSS] Depth First Repeat step

2018-04-16 Thread Stephen Mallette
 Keith, I have to admit that I didn't really follow this general body of
work closely, which include this pull request, the earlier one from Michael
Pollmeier, the JIRA and I think some discussion somewhere on one of the
mailing lists. As best I have gathered from what I did follow, I feel like
the focus of this work so far was mostly related to "can someone get it to
actually work", but not a lot about other aspects like testing, api,
release branch to apply it to, etc. Is that a fair depiction of how this
work has developed so far? if so, let's use this thread to make sure we're
all on the same page as to how this change will get in on all those sorts
of issues.

btw, thanks to you and michael for sticking at this to come to something
that seems to work. looking forward to the continued discussion on this
thread.


On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
mich...@michaelpollmeier.com> wrote:

> Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling is
> that most users currently expect it to be DFS since that's what the docs
> say.
>
> And yes, it's easy to verify the default in the test suite, once we
> agreed on what the default should be.
>
> Cheers
> Michael
>
> On 17/04/18 04:40, pieter gmail wrote:
> > Hi,
> >
> > I have not properly followed the previous thread. However I thought is
> > going to be a way to set the default behavior as apposed to needing to
> > use barrier()
> > Is this the case or not?
> >
> > For Sqlg at least it is possible to optimize BFS much more effectively
> > than DFS so it will be nice to have a way to set the strategy rather
> > than having to manually inject barriers.
> >
> > Is the test suite going to enforce the BFS vs DFS?
> >
> > Thanks
> > Pieter
> >
> > On 16/04/2018 16:56, Daniel Kuppitz wrote:
> >> +1 for DFS. If the query relies on BFS, you can always do
> >> .repeat(barrier())...
> >>
> >> ^ This holds true as long as there's no significant difference in the
> >> cpu+memory consumption and overall performance of the two approaches.
> BFS
> >> has its advantages when it comes to bulking; an arbitrary number of
> >> traversers on the same element is processed at the same pace as a single
> >> traverser. I don't think we can benefit from bulking in DFS.
> >>
> >> Cheers,
> >> Daniel
> >>
> >>
> >> On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes 
> wrote:
> >>
> >>> As part of #838  there’s
> >>> some
> >>> discussion around whether or not to make DFS the default for the repeat
> >>> step. On the one hand, everything else in OLTP is depth first. On the
> >>> other
> >>> hand, there’s likely existing traversals that depend on the breadth
> >>> first
> >>> nature of repeat. My general preference is to make DFS optional at
> >>> first,
> >>> and at some later date, change the default and have that be a separate
> >>> change from implementing DFS for repeat
> >>> ​
> >>>
> >
> >
>
>


Re: [DISCUSS] Depth First Repeat step

2018-04-16 Thread pieter gmail

Hi,

I have not properly followed the previous thread. However I thought is 
going to be a way to set the default behavior as apposed to needing to 
use barrier()

Is this the case or not?

For Sqlg at least it is possible to optimize BFS much more effectively 
than DFS so it will be nice to have a way to set the strategy rather 
than having to manually inject barriers.


Is the test suite going to enforce the BFS vs DFS?

Thanks
Pieter

On 16/04/2018 16:56, Daniel Kuppitz wrote:

+1 for DFS. If the query relies on BFS, you can always do
.repeat(barrier())...

^ This holds true as long as there's no significant difference in the
cpu+memory consumption and overall performance of the two approaches. BFS
has its advantages when it comes to bulking; an arbitrary number of
traversers on the same element is processed at the same pace as a single
traverser. I don't think we can benefit from bulking in DFS.

Cheers,
Daniel


On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes  wrote:


As part of #838  there’s
some
discussion around whether or not to make DFS the default for the repeat
step. On the one hand, everything else in OLTP is depth first. On the other
hand, there’s likely existing traversals that depend on the breadth first
nature of repeat. My general preference is to make DFS optional at first,
and at some later date, change the default and have that be a separate
change from implementing DFS for repeat
​





Re: [DISCUSS] Depth First Repeat step

2018-04-16 Thread Daniel Kuppitz
+1 for DFS. If the query relies on BFS, you can always do
.repeat(barrier())...

^ This holds true as long as there's no significant difference in the
cpu+memory consumption and overall performance of the two approaches. BFS
has its advantages when it comes to bulking; an arbitrary number of
traversers on the same element is processed at the same pace as a single
traverser. I don't think we can benefit from bulking in DFS.

Cheers,
Daniel


On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes  wrote:

> As part of #838  there’s
> some
> discussion around whether or not to make DFS the default for the repeat
> step. On the one hand, everything else in OLTP is depth first. On the other
> hand, there’s likely existing traversals that depend on the breadth first
> nature of repeat. My general preference is to make DFS optional at first,
> and at some later date, change the default and have that be a separate
> change from implementing DFS for repeat
> ​
>


[DISCUSS] Depth First Repeat step

2018-04-16 Thread Keith Lohnes
As part of #838  there’s some
discussion around whether or not to make DFS the default for the repeat
step. On the one hand, everything else in OLTP is depth first. On the other
hand, there’s likely existing traversals that depend on the breadth first
nature of repeat. My general preference is to make DFS optional at first,
and at some later date, change the default and have that be a separate
change from implementing DFS for repeat
​