Re: [DISCUSS] Depth First Repeat step

2018-04-28 Thread Stephen Mallette
>  Is the objection to order(SearchAlgo) that it overloads order() or an
objection to specifying, DFS/BFS in the traversal itself?

for me, order() is a step. not a modulator. making it work as presented in
the pull request conflates those two concepts which i think is confusing (i
don't think we do that anywhere else, but feel free to correct me if i'm
wrong).

>  As mentioned earlier I am not a fan of making the user deal with
strategies.  Strategies to my mind is not part of the gremlin as a language
specification

There are already precedents for users dealing with strategies. Consider
the decorations like SubgraphStrategy, EventStrategy, etc. So, it is not
foreign or hidden. It is a component of traversal configuration.

I would have to think about a new step like traverse(). I'd prefer to avoid
new steps when we already have other infrastructure from the language that
can support what we're trying to do. I think that's why I like the approach
I outlined in my last post - it just sorta fits with what we have.



On Sat, Apr 28, 2018 at 1:37 PM, pieter gmail 
wrote:

> Hi,
>
> Is the objection to order(SearchAlgo) that it overloads order() or an
> objection to specifying, DFS/BFS in the traversal itself?
> If so I do not really see how it is misplaced from a usability/API
> perspective. Seems pretty natural to me and very graphy at that.
>
> As mentioned earlier I am not a fan of making the user deal with
> strategies. Strategies to my mind is not part of the gremlin as a language
> specification. You mention LazyBarrierStrategy yet it is a internal
> strategy to TinkerPop. Sqlg removes it and passes all tests without me even
> being aware anymore if TinkerPop would have had a LazyBarrierStrategy or
> not. I see this as a strength of TinkerPop. Strategies can evolve over time
> without it having a effect on the specification and user's codebase out
> there.
>
> Here comes where my thoughts have gone to,
>
> g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)
>
> This should apply to all traversals, not just RepeatStep. e.g.
> g.V().out().out().traverse(DFS).
> If no traverse(DFS/BFS) is specified then the graph provider can do
> whatever they want, DFS, BFS or a mix of the two.
> If however traverse(BFS/DFS) is specified then the graph provider must
> follow the directive and the test suite will test for it.
>
> traverse(DFS/BFS) is like any other traversal in that it can be specified
> mid traversal. Everything before it must follow the directive, everything
> after need not to.
>
> This will also be backward compatible which is kinda nice.
>
> Cheers
> Pieter
>
> On 27/04/2018 21:03, Stephen Mallette wrote:
>
>> It seems like we have general agreement on the easy things, that is:
>>
>> 1. this is a change for 3.4.0/master and
>> 2. we're all for a DFS option
>>
>> but we still have the hard part of having to come to consensus on how it's
>> used/implemented. The quick summary of this thread in that regard goes
>> something like this: We currently have this PR that introduces DFS, but
>> does so as a configuration per repeat() step. From what I gather a good
>> many of us seem to find that approach undesirable for one or more of the
>> following reasons:
>>
>> 1. The use of order() seems misplaced purely from a usability/API
>> perspective
>> 2. The approach seems to be at odds with how everything else works given
>> barrier() and strategies
>> 3. The approach seems to be at odds with our current mixed mode of DFS/BFS
>>
>> I think that we can see those issues resolve themselves with something
>> Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
>> will change that behavior as required. That change would yield the
>> following approaches:
>>
>> Full BFS: manually add `barrier()`'s
>> Mixed mode: Default, let strategies do their thing OR remove strategies
>> and
>> manually add your own barrier()
>> Full DFS: execute `.withoutStrategies(Lazy...)`
>>
>> Futherrmore, we probably should have some form of verification strategy
>> that ensures all BFS or all DFS so that users can't get tricked along the
>> way. It's not enough to just remove LazyBarrierStrategy to get DFS if
>> another strategy comes along and throws in a barrier().
>>
>> So if all that sounds good from a usability perspective, then we get all
>> three modes that we want using existing traversal semantics which removes
>> the three concerns I've summarized from this thread. We also get Keith's
>> desire to have control over which part of a traversal is BFS/DFS if users
>> want that capability because they can do a manual Mixed Mode and add their
>> own barrier() to control the flow. For Pieter (or any graph provider)
>> nothing really changes and there is opportunity to control flow with
>> strategies as usual.
>>
>> I haven't really focused much on what's involved in adapting the current
>> work in the PR to this approach as I more wanted to find the common ground
>> among all the people who commented on the 

Re: [DISCUSS] Depth First Repeat step

2018-04-28 Thread pieter gmail

Hi,

Is the objection to order(SearchAlgo) that it overloads order() or an 
objection to specifying, DFS/BFS in the traversal itself?
If so I do not really see how it is misplaced from a usability/API 
perspective. Seems pretty natural to me and very graphy at that.


As mentioned earlier I am not a fan of making the user deal with 
strategies. Strategies to my mind is not part of the gremlin as a 
language specification. You mention LazyBarrierStrategy yet it is a 
internal strategy to TinkerPop. Sqlg removes it and passes all tests 
without me even being aware anymore if TinkerPop would have had a 
LazyBarrierStrategy or not. I see this as a strength of TinkerPop. 
Strategies can evolve over time without it having a effect on the 
specification and user's codebase out there.


Here comes where my thoughts have gone to,

g.V().anythingTraversal(anyTraversal()).traverse(DFS/BFS)

This should apply to all traversals, not just RepeatStep. e.g. 
g.V().out().out().traverse(DFS).
If no traverse(DFS/BFS) is specified then the graph provider can do 
whatever they want, DFS, BFS or a mix of the two.
If however traverse(BFS/DFS) is specified then the graph provider must 
follow the directive and the test suite will test for it.


traverse(DFS/BFS) is like any other traversal in that it can be 
specified mid traversal. Everything before it must follow the directive, 
everything after need not to.


This will also be backward compatible which is kinda nice.

Cheers
Pieter

On 27/04/2018 21:03, Stephen Mallette wrote:

It seems like we have general agreement on the easy things, that is:

1. this is a change for 3.4.0/master and
2. we're all for a DFS option

but we still have the hard part of having to come to consensus on how it's
used/implemented. The quick summary of this thread in that regard goes
something like this: We currently have this PR that introduces DFS, but
does so as a configuration per repeat() step. From what I gather a good
many of us seem to find that approach undesirable for one or more of the
following reasons:

1. The use of order() seems misplaced purely from a usability/API
perspective
2. The approach seems to be at odds with how everything else works given
barrier() and strategies
3. The approach seems to be at odds with our current mixed mode of DFS/BFS

I think that we can see those issues resolve themselves with something
Kuppitz mentioned to me: repeat() should be DFS by default where barrier()
will change that behavior as required. That change would yield the
following approaches:

Full BFS: manually add `barrier()`'s
Mixed mode: Default, let strategies do their thing OR remove strategies and
manually add your own barrier()
Full DFS: execute `.withoutStrategies(Lazy...)`

Futherrmore, we probably should have some form of verification strategy
that ensures all BFS or all DFS so that users can't get tricked along the
way. It's not enough to just remove LazyBarrierStrategy to get DFS if
another strategy comes along and throws in a barrier().

So if all that sounds good from a usability perspective, then we get all
three modes that we want using existing traversal semantics which removes
the three concerns I've summarized from this thread. We also get Keith's
desire to have control over which part of a traversal is BFS/DFS if users
want that capability because they can do a manual Mixed Mode and add their
own barrier() to control the flow. For Pieter (or any graph provider)
nothing really changes and there is opportunity to control flow with
strategies as usual.

I haven't really focused much on what's involved in adapting the current
work in the PR to this approach as I more wanted to find the common ground
among all the people who commented on the thread. If we agree that this is
a nice way to go, then we can think more about "how" it could happen.

Keith, I saw you mention earlier that:


  The barrier step that Daniel described doesn’t currently work since

there’s basically booleans in the RepeatStep on whether or not to stash the
starts to make the RepeatStep depth first.

I presume that would be some source of technical derailment to this
approach.




On Tue, Apr 24, 2018 at 3:05 PM, Keith Lohnes  wrote:


Yeah, that's what I meant. The steps inside are replaced with some
JanusGraph stuff.

Cheers,
Keith


On Tue, Apr 24, 2018 at 1:52 PM pieter gmail 
wrote:


Nah, that looks to me like the RepeatStep survived. Just the nested
VertexStep that got replaced with JanusgraphVertexStep.
Good for them, first prize is not replacing anything.

Cheers
Pieter

On 24/04/2018 19:50, Keith Lohnes wrote:

It looks like it,
`g.V().has("foo", "bar").repeat(out()).emit().explain()`

yields

`[JanusGraphStep([],[foo.eq(bar)]),
RepeatStep([JanusGraphVertexStep(OUT,vertex),
RepeatEndStep],until(false),emit(true))]`



On Tue, Apr 24, 2018 at 12:12 PM pieter gmail 
Hi,

Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
with g.V().repeat(out()).times(x) only x round trips to the db is

[GitHub] tinkerpop issue #819: Use error-prone compiler

2018-04-28 Thread spmallette
Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/819
  
sorry - i didn't see your reply earlier i the week. i think this PR even 
without the build changes had problems. the PR branch wasn't building I don't 
think (i seem to remember @robertdale bringing that up at various points). was 
all that fixed? 

thus far we've only talked about the build changes as I thought that was 
the primary point of the PR so i never bothered to comment much on the changes 
themselves. since you ask about that now, i'd say that they are hard to 
evaluate as they have been submitted. Because it is one giant PR, I don't 
understand the reasoning for these changes. You do have some detailed comments 
about the changes to `TraversalStrategy` which is nice, but that body of change 
is terrifying to be honest. It wouldn't surprise me if that was the source of 
the build problems.  That sort of change to that important area of code would 
require its own pull request and could require a discussion on the dev list. 

So, reopen this PR? - I'd say "no". I'd say that if you feel really 
strongly about these changes for some reason, then they should be broken apart 
into new individual PRs so that they can each be evaluated on their own merit. 
As a reviewer (and you need 3 committers to vote +1 for us to merge), 
personally I'd be much happier to see PRs that do more than add generics or 
reformat imports because some tool said to do it. Like, if you truly have a 
much better way to sort traversal strategies - wow...that would be awesome! Or 
if traversal strategy application was bug prone and that tool your using 
uncovered that and it fixed a subtle problem no one ever noticed - excellent! 
But, please frame your PR in that fashion so that we understand what we're 
exactly evaluating. Organizing your work that way will help immensely on our 
side.

Hope that helps frame my opinion on this. I'm sure someone else will chime 
in if they think something different. Thanks!


---


[GitHub] tinkerpop issue #819: Use error-prone compiler

2018-04-28 Thread leventov
Github user leventov commented on the issue:

https://github.com/apache/tinkerpop/pull/819
  
@spmallette is it considerable?


---


Re: [Discuss] Versioning for Gremlin

2018-04-28 Thread Stephen Mallette
This could be something we try to address some day in TinkerPop 4.x. It
really shouldn't be handled by serialization versions, but it's also not as
simple as a version field. Gremlin is a little different than the standard
query language because the language is bound to the API which is versioned
by a release. In an extreme crazy example, let's say that in 3.4.0 we had:

g.V().out()

and for some reason in 3.5.0 we decided the Gremlin to do that should be:

g.V().goOuty()

We would deprecate out() and add goOuty() in 3.5.0. Then in 3.6.0 we would
probably remove out() as a method from the API. So, now you try to connect
to the 3.6.0 server with a 3.4.0 client and try to use out() because that's
what the API allows and it won't work and no version field will help fix
that because 3.6.0 won't have any knowledge of what out() means anymore as
it has been removed from the API.

I have given some thought to this general problem and I think the answer
might lie in first decoupling the definition of the Gremlin language from
the Gremlin VM within which it is executed. If there could somehow then be
different VMs for each Gremlin version then you host multiple Gremlin
versions within the same server. There's probably more to it than that, but
I feel like that's where the starting point is. Again, not something we can
likely solve along the 3.x line, but I will make a note to add this as a
point to our 4.x dev/idea doc.



On Sat, Apr 28, 2018 at 12:21 AM, divijvaidy...@gmail.com <
divijvaidy...@gmail.com> wrote:

> Hi all,
>
> I would like to understand the thoughts of the community and future
> direction of development for versioning of Gremlin.
>
> Ideally, if a server supports multiple major versions of TP (let’s say TP3
> & TP4), the client (GLV) should have the capability to issue a query
> pertaining to different versions. It could probably be achieved by using
> different serialization format for different version but IMO, query
> language should not be coupled with serialziation formats. A better way to
> achieve it would be to introduce version field in the byte code message
> itself.
>
> What do you think?
>