Hi,

> I disagree that this is desired behavior. The graph is the raw structure of
> the data and a traversal gives you a "view" of that data. That
> characterization makes sense. So, let's say we have a traversal with
> PartitionGraphStrategy added to our traversal g, then
> t = g.V().out()
> gives us a view of the graph restricted to the data that a particular user
> can see. However, I see a problem in that the raw graph elements can leak
> out of that view. If I iterate this traversal out, I get normal vertex
> objects and from there I can access anything I want. From a conceptual
> perspective, we have the raw data leaking out of the view which I think
> will be very confusing to users. In particular, the more a traversal…

This can happen regardless of TraversalSource. This was possible with GraphStrategies.

> Now, for the argument that you shouldn't be doing that and if you want to
> do something to the elements in a traversal you should do that in the
> traversal itself: I agree with that sentiment and the idea of traversal
> becoming the query language and that's all you ever use. However, if that
> is the case, then we should consider not returning elements at all.
> By analogy to SQL, SQL doesn't allow you to accidentally (or not) slip out
> of the relational algebra and start manipulating database records directly
> (some earlier systems actually allowed that for performance reasons though
> (I assume) it was quickly realized what a horrible idea that is). That's
> kind of what TinkerPop3 does right now. A graph traversal should be
> self-contained and sealed to avoid such conceptual leakage.

I have also thought about not letting elements be returned by a Traversal -- only primitives. But I thought it was too restrictive, so I left it as it is. It's very easy for vendors/application developers to provide a strategy to restrict that (see my next comment).

> Here are my 2 cents on a resolution: It seems we are in agreement that
> people should write traversals to produce the result set they are
> interested in and not do Blueprints style "coding" to get there. With the
> introduced nested traversals, modifiers and all the other new features it
> should indeed be possible to do that 90%+ of the time without using lambdas.
> We could go the route of simply not returning elements in traversals at all
> (but only some projection of them, like a valueMap). However, that leaves a
> small percentage of use cases where you do want to get elements (for instance
> if you need a lambda step inside your traversal or absolutely want to get a
> vertex back).
> In those cases, TP3 should simply wrap the element into a
> TraversalElement which holds a pointer to the source traversal so that it
> remains within the "view".

I don't think wrappers are a good idea. I think if people want to have such restrictions, we simply provide "PrimitivesOnlyVerificationStrategy" which doesn't allow VertexStep, VertexEdgeStep, or PropertiesStep to be the last step in a traversal. Easy peasy -- < 10 lines of code. Moreover, it would be like LambdaTraversalStrategy, with the ability to turn it on and off as needed.

Take care,
Marko.

>
> On Mon, Apr 13, 2015 at 2:52 PM Marko Rodriguez <[email protected]>
> wrote:
>
>> Hi,
>>
>> Yep. The GraphTraversal is your "query." Get the data you want from your
>> query.
>>
>> In SQL, do you want a Row back or do you want a String, a List of Strings,
>> a Map of counts, etc?
>>
>> Finally, if there is something you want to do that can't be done with the
>> provided steps, then use a lambda.
>>
>> Marko.
>>
>> http://markorodriguez.com
>>
>> On Apr 13, 2015, at 3:44 PM, Matt Frantz <[email protected]>
>> wrote:
>>
>>> It's true that doing things The Right Way takes a bit of discipline. When
>>> I first started with TP3, I wanted to get the vertices and then do
>>> post-processing in the application. Matthias's point (if I understand it)
>>> is that this "what can I do with a vertex" approach leads to suboptimal
>>> implementations. Expressing what you want in lambda-free Gremlin is the
>>> goal. So the rule of thumb is to return to your original traversal and
>>> keep extending it until it does everything you want to do.
>>>
>>> On Mon, Apr 13, 2015 at 2:27 PM, Marko Rodriguez <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Yea, there could be a step that yields Traversals if you plan to traverse
>>>> off the returns. But then why not have the logic in your original traversal?
>>>>
>>>> We have to think of Graph as a data structure of vertices and edges.
>>>> Then there are TraversalSources. When you put the graph into these
>>>> traversal sources, you get a "view of the graph" from the perspective of
>>>> the DSL. If you are getting out a vertex, it's a vertex. That's that.
>>>> However, what do you want with that vertex? Its id? Well, end with id().
>>>> Its label? Well, end with label(). So forth and so on… end the
>>>> GraphTraversal with the ultimate result you want.
>>>>
>>>> Thanks,
>>>> Marko.
>>>>
>>>> http://markorodriguez.com
>>>>
>>>> On Apr 13, 2015, at 3:09 PM, Matt Frantz <[email protected]>
>>>> wrote:
>>>>
>>>>> I guess what you want to avoid is a new set of interfaces like
>>>>> VertexForTraversal, EdgeForTraversal, etc. That's a fair point.
>>>>>
>>>>> What a developer has to do now is something like this:
>>>>>
>>>>> t = g.traversal().V().out()
>>>>> while (t.hasNext()) {
>>>>>   v = t.next();
>>>>>   vt = g.traversal().V(v);
>>>>>   vt.out()...;
>>>>> }
>>>>>
>>>>> In effect, the proposed "forTraversal" (or perhaps "asTraversal" or just
>>>>> "traversal") step would simply produce those "vt" traversals.
>>>>>
>>>>> If you wanted both the element and the springboard, you could use select:
>>>>>
>>>>> g.traversal().V().out().as('v').traversal().as('vt').select('v', 'vt');
>>>>>
>>>>> On Mon, Apr 13, 2015 at 1:19 PM, Marko Rodriguez <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Technically, that is possible.
>>>>>>
>>>>>> Would I implement it? No. Wrappers just lead to problems, as we have seen
>>>>>> with Graph strategies.
>>>>>>
>>>>>> Marko.
>>>>>>
>>>>>> http://markorodriguez.com
>>>>>>
>>>>>> On Apr 13, 2015, at 2:14 PM, Matt Frantz <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> What about a step that would wrap the elements, so that the developer
>>>>>>> could decide if she wanted them to be springboards for subsequent
>>>>>>> traversals?
>>>>>>>
>>>>>>> g.traversal().V().out().forTraversal()
>>>>>>>
>>>>>>> On Mon, Apr 13, 2015 at 10:17 AM, Marko Rodriguez <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> No, as that reference does not exist and to add it to every Element
>>>>>>>> produced would be exceedingly expensive --- not only from a 64-bit
>>>>>>>> reference standpoint, but also from a threading standpoint. To make it
>>>>>>>> work, you would have to wrap each Element produced and that would be an
>>>>>>>> Object wrapper with a 64-bit reference. Eek. And then in OLAP, where
>>>>>>>> Elements are created all over the cluster, what 64-bit reference to
>>>>>>>> use?! -- which JVM?
>>>>>>>>
>>>>>>>> Marko.
>>>>>>>>
>>>>>>>> http://markorodriguez.com
>>>>>>>>
>>>>>>>> On Apr 13, 2015, at 10:55 AM, Matt Frantz <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Could Element.traversal() be a shortcut for returning to the
>>>>>>>>> TraversalSource that produced the Element?
>>>>>>>>>
>>>>>>>>> On Mon, Apr 13, 2015 at 9:07 AM, Marko Rodriguez <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> They are stateless. Create once -- use over and over and over.
>>>>>>>>>>
>>>>>>>>>> Marko.
>>>>>>>>>>
>>>>>>>>>> http://markorodriguez.com
>>>>>>>>>>
>>>>>>>>>> On Apr 13, 2015, at 10:01 AM, Bryn Cooke <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Marko,
>>>>>>>>>>>
>>>>>>>>>>> What is the recommended scope of a TraversalSource?
>>>>>>>>>>>
>>>>>>>>>>> Per graph?
>>>>>>>>>>> Per thread?
>>>>>>>>>>>
>>>>>>>>>>> Should I be pooling them?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Bryn
>>>>>>>>>>>
>>>>>>>>>>> On 13/04/15 16:57, Marko Rodriguez wrote:
>>>>>>>>>>>> Hi Matt,
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, that is possible and easy to do, but I would not add it as we
>>>>>>>>>>>> need to stress to people to always go through the same TraversalSource.
>>>>>>>>>>>>
>>>>>>>>>>>> The importance of TraversalSource cannot be overstated. It is
>>>>>>>>>>>> impossible to just have Vertex.out() for the following reasons:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. GraphTraversal is just one type of DSL.
>>>>>>>>>>>> 2. Ignoring 1, then what is the traversal engine that will
>>>>>>>>>>>>    execute Vertex.out()? Spark, Giraph, standard iterator,
>>>>>>>>>>>>    GremlinServer, etc.?
>>>>>>>>>>>> 3. What are the strategies you are applying? You might have
>>>>>>>>>>>>    ReadOnlyStrategy on g.V(), but then you v.out().remove().
>>>>>>>>>>>>    Strategies gone…
>>>>>>>>>>>>
>>>>>>>>>>>> TraversalSource is your "traversal context." Users should always use
>>>>>>>>>>>> this. If they want low level methods, they can, but they are not
>>>>>>>>>>>> guaranteed:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. An execution engine.
>>>>>>>>>>>> 2. A set of strategies.
>>>>>>>>>>>> 3. DSL method chaining.
>>>>>>>>>>>>
>>>>>>>>>>>> While we can do v.traversal().out(), you are then creating a new
>>>>>>>>>>>> TraversalSource. This is expensive and diverts the user from using the
>>>>>>>>>>>> original TraversalSource. For instance, let's say you are working with
>>>>>>>>>>>> SparkGraphComputer, then you would have to do this:
>>>>>>>>>>>>
>>>>>>>>>>>> v.traversal(computer(SparkComputerEngine)).out()
>>>>>>>>>>>>
>>>>>>>>>>>> This creates a new TraversalSource, traversal engine, graph
>>>>>>>>>>>> references, etc… it's just not "the way."
>>>>>>>>>>>>
>>>>>>>>>>>> Marko.
>>>>>>>>>>>>
>>>>>>>>>>>> http://markorodriguez.com
>>>>>>>>>>>>
>>>>>>>>>>>> On Apr 13, 2015, at 9:42 AM, Matt Frantz <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Could something similar to what was done in splitting Graph and
>>>>>>>>>>>>> GraphTraversalSource happen with Vertex/Edge? That is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> v.traversal().out()...
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Apr 13, 2015 at 7:28 AM, Marko Rodriguez <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You can't start a traversal from any element because GraphTraversal
>>>>>>>>>>>>>> is just one type of DSL. For instance,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> vertex.friends().name()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> …would not exist as methods.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, users can do vertex.edges() if they please, but it's not
>>>>>>>>>>>>>> from a TraversalSource so it's not "DSL"'d. If you want optimizations,
>>>>>>>>>>>>>> method chaining, etc., everything must go through a TraversalSource;
>>>>>>>>>>>>>> if not, it's "raw methods."
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Marko.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://markorodriguez.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 13, 2015, at 3:39 AM, Bryn Cooke <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have to agree,
>>>>>>>>>>>>>>> The loss of being able to start a traversal at an element is a
>>>>>>>>>>>>>>> real blow, although I'm sure it was done for good reasons.
>>>>>>>>>>>>>>> Here are some additional considerations:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Graph vendors and TP users have different requirements for an API
>>>>>>>>>>>>>>>   that may not be unifiable in a satisfactory way. So perhaps the
>>>>>>>>>>>>>>>   current interfaces are geared towards graph vendors and a wrapper
>>>>>>>>>>>>>>>   could be created for users. Without moving from interfaces to
>>>>>>>>>>>>>>>   abstract classes and therefore gaining the extra power of protected
>>>>>>>>>>>>>>>   scope any unified API will be difficult to achieve.
>>>>>>>>>>>>>>> * Scala and Groovy have added functionality to make Gremlin easier to
>>>>>>>>>>>>>>>   deal with.
>>>>>>>>>>>>>>>   The same can and perhaps should be done for Java. Type
>>>>>>>>>>>>>>>   safety and syntactic sugar is available to different degrees in each
>>>>>>>>>>>>>>>   language, so perhaps we should not try too hard in gremlin core and
>>>>>>>>>>>>>>>   leave that to language specific bindings. In short, gremlin core
>>>>>>>>>>>>>>>   could be targeted to the JVM and Java/Scala/Groovy users have
>>>>>>>>>>>>>>>   something else that happens to allow traversals from elements.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Bryn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 13/04/15 07:09, pieter-gmail wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I concur with Matthias.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Pieter
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 13/04/2015 01:59, Matthias Broecheler wrote:
>>>>>>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> after playing with the M8 release for a bit I wanted to discuss the
>>>>>>>>>>>>>>>>> following: With M8, TP3 effectively brings back the distinction between
>>>>>>>>>>>>>>>>> Blueprints and Gremlin, i.e. there are low level methods for accessing a
>>>>>>>>>>>>>>>>> vertex' adjacency list and there are the traversals.
>>>>>>>>>>>>>>>>> In TP2 that was an issue because developers would start implementing
>>>>>>>>>>>>>>>>> against Blueprints directly and treat it like a graph library - not like a
>>>>>>>>>>>>>>>>> query language. It can be reasonably assumed that the same will happen for
>>>>>>>>>>>>>>>>> TP3. This will be further aggravated by the fact that element traversals
>>>>>>>>>>>>>>>>> are no longer supported in TP3.
>>>>>>>>>>>>>>>>> Meaning, you can no longer do
>>>>>>>>>>>>>>>>> v.out('knows').in('knows') but have to put the vertex back into the
>>>>>>>>>>>>>>>>> GraphTraversalSource. That will be very confusing and one can expect that
>>>>>>>>>>>>>>>>> users will prefer using the primitive adjacency list calls instead.
>>>>>>>>>>>>>>>>> When you have a vertex and you try to traverse out of it, you will type in
>>>>>>>>>>>>>>>>> "v." in your IDE. Lacking any other options, the user will select
>>>>>>>>>>>>>>>>> "v.edges()", etc.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I wanted to bring this to your attention since I like the vision of
>>>>>>>>>>>>>>>>> "everything is Gremlin". In naming this is true but I am afraid that actual
>>>>>>>>>>>>>>>>> user behavior will be different.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Why not hide the access methods in the iterator method as was done in the
>>>>>>>>>>>>>>>>>   last milestone release?
>>>>>>>>>>>>>>>>> - Should we enforce that the GraphTraversalSource is attached to each
>>>>>>>>>>>>>>>>>   element so that traversing out of it is possible?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Matthias
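
For readers of the archive, here is a rough illustration of the "PrimitivesOnlyVerificationStrategy" idea from the top of this message: reject any traversal whose last step would hand back an element. It is only a toy model under loud assumptions. The class PrimitivesOnlySketch, the StepKind enum and the on/off flag are all hypothetical and are not the TinkerPop API; a real strategy would plug into TinkerPop's own strategy mechanism instead.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are the real TinkerPop classes.
final class PrimitivesOnlySketch {

    // Hypothetical step kinds, named after the steps mentioned in the thread.
    enum StepKind { VERTEX_STEP, VERTEX_EDGE_STEP, PROPERTIES_STEP, ID_STEP, LABEL_STEP, VALUE_MAP_STEP }

    // Step kinds assumed to emit graph elements or properties instead of plain values.
    private static final Set<StepKind> ELEMENT_STEPS =
            EnumSet.of(StepKind.VERTEX_STEP, StepKind.VERTEX_EDGE_STEP, StepKind.PROPERTIES_STEP);

    // Assumed on/off switch, mirroring the "turn it on and off as needed" remark.
    private final boolean enabled;

    PrimitivesOnlySketch(boolean enabled) {
        this.enabled = enabled;
    }

    // Throws if the check is enabled and the final step would emit an element.
    void verify(List<StepKind> steps) {
        if (!enabled || steps.isEmpty()) {
            return;
        }
        StepKind last = steps.get(steps.size() - 1);
        if (ELEMENT_STEPS.contains(last)) {
            throw new IllegalStateException("Traversal may not end with " + last);
        }
    }

    // Convenience wrapper that reports the verdict as a boolean.
    boolean allows(List<StepKind> steps) {
        try {
            verify(steps);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        PrimitivesOnlySketch strict = new PrimitivesOnlySketch(true);
        // Models g.V().out().id(): the last step yields a plain value.
        System.out.println(strict.allows(Arrays.asList(StepKind.VERTEX_STEP, StepKind.ID_STEP)));
        // Models g.V().out(): the last step yields vertices.
        System.out.println(strict.allows(Arrays.asList(StepKind.VERTEX_STEP)));
    }
}
```

The check looks only at the final step, which is what the proposal describes; element-producing steps in the middle of a traversal stay legal as long as the traversal ends in something like id(), label() or a valueMap.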
