Re: [DISCUSS] TinkerPop maintained Gremlin Language Variants

Stephen Mallette Thu, 21 Apr 2016 07:19:23 -0700

I'm not sure what you mean by:

> Why is this not an option: It boils down to the fact that you still can't
> modify a traversal "after the fact".


You can modify the Traversal up to the point where you iterate it and the
TraversalStrategy instances are applied.

gremlin> base = g.V().hasLabel('person').out();[]
gremlin> filter = __.has('age',29)
gremlin> TraversalHelper.insertTraversal(1,filter,base);[]
gremlin> base.toString()
==>[GraphStep(vertex,[]), HasStep([~label.eq(person)]),
HasStep([age.eq(29)]), VertexStep(OUT,vertex)]
gremlin> base
==>v[3]
==>v[2]
==>v[4]

Note how I injected an anonymous Traversal (i.e. the "filter") into the
middle of a "base" Traversal. TraversalHelper.insertTraversal() is a
little-known function that is probably familiar to TraversalStrategy
developers only, but it exists. I suppose more advanced forms of Gremlin
Language Variants should/could have this kind of capability - I guess the
point is that it's possible and arguably better than string manipulation.
You might not agree, but am I at least grasping the problem you're trying
to express?
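
For readers without a Gremlin console at hand, the same idea can be sketched
in plain Python. This is only a toy model under stated assumptions: a
traversal is reduced to a list of step strings, and ToyTraversal and
insert_traversal are hypothetical names, not part of TinkerPop or of any
Gremlin Language Variant. It mirrors the session above, where index 1 places
the filter right after the label filter, and it refuses changes once the
traversal has been iterated.

```python
class ToyTraversal:
    """Toy stand-in for a traversal: an ordered list of step descriptions."""

    def __init__(self, steps=None):
        self.steps = list(steps or [])
        self.locked = False  # becomes True once the traversal is iterated

    def iterate(self):
        # In this toy model, iterating only freezes the step list.
        self.locked = True
        return self


def insert_traversal(index, fragment, base):
    """Insert the steps of `fragment` after the step at `index` in `base`."""
    if base.locked:
        raise RuntimeError("traversal already iterated; steps are frozen")
    position = index + 1
    # Slice assignment splices the fragment's steps in without replacing any.
    base.steps[position:position] = fragment.steps
    return base


# Hypothetical step strings modelled on the console session above.
base = ToyTraversal(["V()", "hasLabel('person')", "out()"])
age_filter = ToyTraversal(["has('age', 29)"])

insert_traversal(1, age_filter, base)
print(base.steps)
```

The point of the sketch is only that the base and the filter stay separate
objects until they are combined, so neither has to be held as a string.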


On Thu, Apr 21, 2016 at 9:49 AM, Dylan Millikin <dylan.milli...@gmail.com>
wrote:

> Essentially what is required is to be able to "build" traversals with a
> complete disregard for the order of the steps you're adding. Think of it as
> adding entries to a match(). It doesn't matter what order you add those
> entries in.
> This is something that is very "declarative" in nature but is a requirement
> when developing complex applications that need modularity. You also need
> this in an imperative setting as you may want control on how the data is
> fetched.
>
> On Thu, Apr 21, 2016 at 9:39 AM, Dylan Millikin <dylan.milli...@gmail.com>
> wrote:
>
> > Yeah there are some more complex situations. You actually left out the
> > final out() in your example. For instance just keeping things simple,
> > your base query could be required to be reused in many places. So the
> > following is not an option:
> >
> > base = g.V().hasLabel('person');[]
> > /** apply some filters like you did **/
> > base.out('company')
> >
> > Why is this not an option: It boils down to the fact that you still can't
> > modify a traversal "after the fact".
> > Your reusable base is not complete without the out() step. Therefore,
> > using this model you can't store the base query and test it separately.
> > You also have to append the out step everywhere you use this base, which,
> > in addition to creating a lot of duplicate code, may not be an option
> > when queries are generated automatically. Sometimes you don't even really
> > know what the base query is; you just know that it contains certain
> > "injection points".
> > To give you a few examples, the following base queries all have the same
> > injection point that allows them to be edited with the same filters:
> >
> > g.V().hasLabel("person")/** inject here **/.out('company');
> > g.V().hasLabel("person")/** inject here **/.out('friend').out('company');
> > g.V().has("person", "name", "marko").out('friend')/** inject here **/.out('company');
> >
> > These are all valid. My filters apply to "people" but the traversals
> > leading to and from these people can be anything. And programmatically I
> > have no way of knowing what the traversal is. This gets increasingly
> > complex with injection points in sub-traversals, or when injecting full
> > traversals with their own injection points, and so on.
> >
> > Also your example doesn't cover changing existing steps.
> >
> > I don't know if that sheds any more light.
> >
> > On Thu, Apr 21, 2016 at 9:14 AM, Stephen Mallette <spmalle...@gmail.com>
> > wrote:
> >
> >> So "query building" is where you hold string representations of steps or
> >> groups of steps and then have some logic that concatenates them together.
> >> I guess my question is why are "strings" a requirement for that? It seems
> >> like you could do the same query builder stuff with an actual Traversal
> >> object in whatever Gremlin Language Variant you were working with. How is
> >> string concatenation different than this:
> >>
> >> gremlin> graph = TinkerFactory.createModern()
> >> ==>tinkergraph[vertices:6 edges:6]
> >> gremlin> g = graph.traversal()
> >> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
> >> gremlin> base = g.V().hasLabel('person');[]
> >> gremlin> addFilters = { age, name, t ->
> >> gremlin>   t = age > 0 ? t.has('age',age) : t
> >> gremlin>   !name.isEmpty() ? t.has('name',name) : t
> >> gremlin> }
> >> ==>groovysh_evaluate$_run_closure1@503d56b5
> >> gremlin> traversal = addFilters(29,'',base);[]
> >> gremlin> traversal.toString()
> >> ==>[GraphStep(vertex,[]), HasStep([~label.eq(person)]),
> >> HasStep([age.eq(29)])]
> >> gremlin> traversal
> >> ==>v[1]
> >>
> >> What about DSLs? Maybe filtering logic drops behind a custom step specific
> >> to the domain:
> >>
> >> g.V().personsWith(29, '')
> >>
> >> which would basically compile to the same thing as the gremlin output
> >> above:
> >>
> >> [GraphStep(vertex,[]), HasStep([~label.eq(person)]),
> >> HasStep([age.eq(29)])]
> >>
> >> Is there some more complex aspect of "query building" that can't be
> >> handled this way (I know I took a fairly simple example)?
> >>
> >>
> >>
> >> On Wed, Apr 20, 2016 at 7:54 PM, Dylan Millikin <dylan.milli...@gmail.com>
> >> wrote:
> >>
> >> > I've approached this a few times in the past, though never really in
> >> > depth. The idea is that you want separation of logic in your queries,
> >> > mostly for maintenance, testing and overall convenience. This takes the
> >> > form of partial/incomplete traversals that have neither a start nor an
> >> > end.
> >> > One of the simplest applications would be to have a base query and apply
> >> > filters to this query. For example, we can imagine a recap page of
> >> > everyone's companies:
> >> >
> >> > base query : g.V().has(label, "person").out('company')
> >> >
> >> > To this you have a set of filters that are partial traversals and allow
> >> > users to better refine their search:
> >> >
> >> > older than 30 filter: has("age", gt(30))
> >> > male filter: has("gender", "male")
> >> >
> >> > Now depending on user input you'll want to apply either or both of the
> >> > above to "person" in order to obtain something along the lines of:
> >> >
> >> > g.V().has(label, "person").has("age", gt(30)).has("gender", "male").out('company')
> >> >
> >> > Of course there are plenty of ways of doing the above depending on
> >> > requirements and complexity, such as where() or as("a")....select("a"),
> >> > etc.
> >> >
> >> > These are simple examples where steps are appended in various places, but
> >> > you can imagine the same with traversal manipulation, such as turning
> >> > out('company') into out('company', 'organization'), where the step gets
> >> > altered. This would work the same way if you used union(out('company'))
> >> > instead. You would have to alter the union step like this:
> >> > union(out('company'), out('organization'))
> >> >
> >> > All of the above are relatively simple, but picture having a lot of very
> >> > complex traversal "filters" that sometimes have dependencies on other
> >> > filters.
> >> > All this is doable, but it currently requires looking at the gremlin
> >> > building process as a query building one rather than native support for
> >> > the gremlin language.
> >> >
> >> > Does that make any sense?
> >> >
> >> > On Wed, Apr 20, 2016 at 7:15 PM, Stephen Mallette <spmalle...@gmail.com>
> >> > wrote:
> >> >
> >> > > > It's imperative that you have the ability to group multiline scripts
> >> > > > into a single query or the reactivity of your applications will
> >> > > > greatly suffer.
> >> > >
> >> > > A fair point and something we might yet address as part of all this
> >> > > thinking. RemoteGraph is pretty new and it demonstrated a critical
> >> > > aspect of communications with Gremlin Server. Now we need to think
> >> > > about how to improve upon it.
> >> > >
> >> > > > Namely having the ability to edit/modify/add steps to an already
> >> > > > defined traversal.
> >> > >
> >> > > I don't quite follow that point...could you please elaborate?
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Wed, Apr 20, 2016 at 4:15 PM, Dylan Millikin <dylan.milli...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Nice post,
> >> > > >
> >> > > > I'm going to jump straight to the end. I think that one of the
> >> > > > problems of doing an abstraction similar to RemoteGraph is that in
> >> > > > reality the overhead of communicating with the server is too big to
> >> > > > make this viable. It's imperative that you have the ability to group
> >> > > > multiline scripts into a single query or the reactivity of your
> >> > > > applications will greatly suffer.
> >> > > >
> >> > > > The Gremlin-java base also has a few limitations that some query
> >> > > > builders try to fix (which can only be done if you abandon the idea
> >> > > > of a natural gremlin language variant in favor of a query builder).
> >> > > > Namely having the ability to edit/modify/add steps to an already
> >> > > > defined traversal. Though in time it might be nice to have these be
> >> > > > part of the original implementation.
> >> > > >
> >> > > > I personally love the idea of gremlin language variants. I just
> >> > > > don't think their production value is any good without some extended
> >> > > > functionality (beyond what gremlin-java currently is).
> >> > > >
> >> > > > On Wed, Apr 20, 2016 at 3:23 PM, Stephen Mallette <spmalle...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > This thread on Gremlin Language Variants has been very interesting:
> >> > > > >
> >> > > > > https://pony-poc.apache.org/thread.html/Zcazrw7k442xcwc
> >> > > > >
> >> > > > > I think that this work goes a long way to address two issues I've
> >> > > > > been concerned about:
> >> > > > >
> >> > > > > 1. Greater consistency in how different languages do Gremlin
> >> > > > > 2. Less fragmentation in terms of libraries and how they work so
> >> > > > > that users aren't confused with how to get started (though I don't
> >> > > > > think the goal here is to restrict choices or slow down innovation)
> >> > > > >
> >> > > > > One of the first things we should probably do is start thinking in
> >> > > > > terms of the types of libraries that are built on TinkerPop (outside
> >> > > > > of those things that are Graph Systems) and those are listed here
> >> > > > > currently:
> >> > > > >
> >> > > > > http://tinkerpop.apache.org/#graph-libraries
> >> > > > >
> >> > > > > Marko mentioned to me that he saw the libraries we listed here
> >> > > > > breaking into three categories:
> >> > > > >
> >> > > > > 1. Gremlin Language Variants - which the other thread demonstrates
> >> > > > > quite nicely
> >> > > > > 2. Gremlin Drivers - the Gremlin Server protocol implementations -
> >> > > > > those things that send traversals to Gremlin Server and get back
> >> > > > > results.
> >> > > > > 3. OGM and others - I say "others" because there might be plugins
> >> > > > > and other similar odds and ends
> >> > > > >
> >> > > > > I like Marko's category system here and I think that having these
> >> > > > > kinds of categories will help folks organize their libraries to fit
> >> > > > > into one of these spaces and make it easier for users to know what
> >> > > > > they need to get in order to start doing TinkerPop in their
> >> > > > > language.
> >> > > > >
> >> > > > > Anyway, the category thing is just setting the stage for this big
> >> > > > > bombshell. I think TinkerPop should consider maintaining the Gremlin
> >> > > > > Language Variants.
> >> > > > >
> >> > > > > Heresy! right?
> >> > > > >
> >> > > > > Well, I think it's the best way to achieve consistency across
> >> > > > > languages. Under this model, TinkerPop provides the base language
> >> > > > > variant and people can choose to extend upon it, but the base stays
> >> > > > > tied to our archetype of Java and we end up with a much clearer
> >> > > > > story for virtually any programming language.
> >> > > > >
> >> > > > > So how do we do this? Slowly and deliberately. We should look to
> >> > > > > only include language variants where we have:
> >> > > > >
> >> > > > > + good automation in place (like what Marko did for Python),
> >> > > > > + some competence on the committer list in that language
> >> > > > > + a nice testing framework that operates in our standard
> >> > > > > build/release process.
> >> > > > >
> >> > > > > That's setting a high bar, but if we don't keep it high, I think we
> >> > > > > will be left unable to properly support and maintain what we hang
> >> > > > > out there.
> >> > > > >
> >> > > > > I'd also like to express that we should not be looking to maintain
> >> > > > > language drivers. I think that should remain a third-party community
> >> > > > > effort just like Graph Systems. In other words, we remain a
> >> > > > > repository for reference implementations for everything else. Why?
> >> > > > > Because, as it sits right now, just based on the level of effort for
> >> > > > > what Marko did with Python, maintaining a "base" Gremlin Language
> >> > > > > Variant shouldn't be hard. We won't be building tons of add-on
> >> > > > > capabilities to the base variants - they will pretty much just stay
> >> > > > > on par with the java archetype. Drivers on the other hand have lots
> >> > > > > of implementation details, with many different technologies that
> >> > > > > could be used, etc. They have similar complexity to Graph System
> >> > > > > implementations in many ways. I also think that the drivers can
> >> > > > > afford to have different APIs and approaches without being
> >> > > > > detrimental to the community.
> >> > > > >
> >> > > > > If gremlin-js-driver wants to do:
> >> > > > >
> >> > > > > client.submit("g.V()")
> >> > > > >
> >> > > > > and gremlin-python-driver wants to do:
> >> > > > >
> >> > > > > client.send("g.V()")
> >> > > > >
> >> > > > > that's not a big deal.
> >> > > > >
> >> > > > > The last point that I'll make is that I think Gremlin Language
> >> > > > > Variants, that don't operate on the JVM (e.g. Jython) and use
> >> > > > > Gremlin Server, should have some abstraction that is similar to
> >> > > > > RemoteGraph. RemoteGraph exposes a DriverConnection interface that
> >> > > > > is currently implemented by gremlin-driver. The DriverConnection is
> >> > > > > responsible for sending a traversal to the server and returning
> >> > > > > results. It would be nice if the language variants had a similar
> >> > > > > interface that the various community drivers could implement. In
> >> > > > > that way, the user never has to do any form of:
> >> > > > >
> >> > > > > client.submit(someGremlinString)
> >> > > > >
> >> > > > > in any language. We really need to try to make that pattern go away
> >> > > > > across the TinkerPop community.
> >> > > > >
> >> > > > > So - that was a long email. Looking forward to hearing some
> >> > > > > discussion on this.
> >> > > > >
> >> > > > > Stephen
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
