Re: [DISCUSS] Creating pattern steps to codify best practices

Stephen Mallette Tue, 08 Dec 2020 11:30:01 -0800

ah - regarding 2, ok i couldn't remember and i didn't notice where we
talked about that when I scanned the thread. I've updated the gist
accordingly:


https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31





On Tue, Dec 8, 2020 at 10:51 AM David Bechberger <[email protected]>
wrote:

> For #1 and 3 I agree.  For #2 I thought that we agreed that the properties
> within the upsert were going to be for creation only and that any
> properties that needed to be updated would be done via the standard
> property() syntax after the upsert.  e.g.
>
> g.upsertV('person', [name: 'marko'],
>                     [age: 29]).property('lang', 'java')
>
> If it does not exist then this would create a vertex with the 'name',
> 'age', and 'lang' properties.  If it does exist then it would just update
> the 'lang' property.
>
> Dave
>
> On Tue, Dec 8, 2020 at 5:57 AM Stephen Mallette <[email protected]>
> wrote:
>
> > I started a expanded this discussion to gremlin-users for a wider
> audience
> > and the thread is starting to grow:
> >
> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
> >
> > I guess we'll need to summarize that discussion back here now....
> >
> > I did have some more thoughts to hang out there and figured that I
> wouldn't
> > convolute the discussion on gremlin-users with it so I will continue the
> > discussion here.
> >
> > 1, The very first couple of examples seem wrong (or at least not best
> > demonstrating the usage):
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [name: 'marko', age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [(T.id): 1, name: 'Marko'])
> >
> > should instead be:
> >
> > g.upsertV('person', [name: 'marko'],
> >                     [age: 29])
> > g.upsertV('person', [(T.id): 1],
> >                     [name: 'Marko'])
> >
> > 2. I can't recall if we made this clear anywhere but in situations where
> we
> > "get" rather than "create" do the additional properties act in an update
> > fashion to the element that was found? I think I've been working on the
> > assumption that it would, though perhaps that is not always desirable?
> >
> > 3. We really never settled up how to deal with multi/meta-properties.
> That
> > story should be clear so that when we document upsert() we include the
> > approaches for the fallback patterns that don't meet the 90% of use cases
> > we are targeting and I sense that we're saying that meta/multi-properties
> > don't fit in that bucket. So with that in mind, I don't think that the
> > following works for metaproperties:
> >
> > g.upsertV('person', [(T.id): 1],
> >                     [name:[acl: 'public'])
> >
> > as it doesn't let us set the value for the "name" just the pairs for the
> > meta-properties. I guess a user would have to fall back to:
> >
> > g.upsertV('person', [(T.id): 1]).
> >   property('name','marko','acl','public')
> >
> > // or use the additional properties syntax
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').property('acl','public')
> >
> > // or if there were multi-properties then maybe...
> > g.upsertV('person', [(T.id): 1],[name:'marko']).
> >   properties('name').hasValue('marko').property('acl','public')
> >
> > As for multi-properties, I dont think we should assume that a List object
> > should be interpreted by Gremlin as a multi-property. Perhaps we just
> rely
> > on the underlying graph to properly deal with that given a schema or the
> > user falls back to:
> >
> > // this ends up as however the graph deals with a List object
> > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java'])
> >
> > // this is explicit
> > g.upsertV('person', [(T.id): 1]).
> >   property(list, 'lang', 'java').
> >   property(list, 'lang, 'scala').
> >   property(list, 'lang', 'java')
> >
> > If that makes sense to everyone, I will update the gist.
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <[email protected]>
> > wrote:
> >
> > > Hello Stephen,
> > >
> > > Thanks for making that gist, its much easier to follow along with the
> > > proposed syntax there.  To your specific comments:
> > >
> > > #1 - My only worry with the term upsert is that it is not as widely a
> > used
> > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher).  However I
> > > don't have a strong opinion on this, so I am fine with either.
> > > #2 - My only real objective here is to make sure that we make the
> 80-90%
> > > case easy and straightforward.  I think that having the fallback option
> > of
> > > using the current syntax for any complicated edge cases should be
> > > considered here as well. I'd appreciate your thoughts here as these are
> > > good points you bring up that definitely fall into the 80-90% use case.
> > > #3 - Those points make sense to me, not sure I have anything further to
> > add
> > > #4 - I don't think I would expect to have the values extracted from the
> > > traversal filters but I'd be interested in other opinions on that.
> > >
> > > Thanks,
> > > Dave
> > >
> > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <[email protected]
> >
> > > wrote:
> > >
> > > > It's been a couple weeks since we moved this thread so I went back
> > > through
> > > > it and refreshed my mind with what's there. Dave, nice job putting
> all
> > > > those examples together. I've recollected them all together and have
> > just
> > > > been reviewing them in a more formatted style with proper Groovy
> syntax
> > > > than what can be accomplished in this mailing list. Anyone who cares
> to
> > > > review in that form can do so here (i've pasted its markdown contents
> > at
> > > > the bottom of this post as well):
> > > >
> > > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
> > > >
> > > > In this light, I have the following comments and thoughts:
> > > >
> > > > 1. I feel more inclined toward saving "merge" for a more generalized
> > step
> > > > of that name and have thus renamed the examples to use "upsert".
> > > >
> > > > 2. I still don't quite feel comfortable with meta/multi-properties
> > > > examples. Perhaps you could explain further as it's possible I'm not
> > > > thinking things through properly. For meta-properties the example is:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > >
> > > > So I see that "name" takes a Map here, but how does Gremlin know if
> > that
> > > > means "meta-property" or "Map value" and if the former, then how do
> you
> > > set
> > > > the value of "name"?  My question is similar to the multi-property
> > > examples
> > > > - using this one:
> > > >
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > >
> > > > How does Gremlin know if the user means that "lang" should be a
> > > > "multi-property" with Cardinality.list (or Cardinality.set for that
> > > matter)
> > > > or a value of Cardinality.single with a List/Array value? I can
> perhaps
> > > > suggest some solutions here but wanted to make sure I wasn't just
> > > > misunderstanding something.
> > > >
> > > > 3. The "upsert with stream" example is:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > That would establish an API of upsertV(Traversal, Traversal,
> Traversal)
> > > > which would enable your final thought of your last post with:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > The open-ended nature of that is neat but comes with some complexity
> > with
> > > > steps that traditionally take an "open" Traversal (and this step now
> > has
> > > up
> > > > to three of them) - perhaps, that's our own fault in some ways. For
> the
> > > > "stream" concept, values()/valueMap() won't immediately work that way
> > as
> > > > they only apply to Element...given today's Gremlin semantics, I think
> > > we'd
> > > > re-write that as:
> > > >
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(select(label), select(id), select(id,label,'name'))
> > > >
> > > > Though, even that doesn't quite work because you can't select(T)
> right
> > > now
> > > > which means you'd need to go with:
> > > >
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > >
> > > > not terrible in a way and perhaps we could fix select(T) - can't
> > remember
> > > > why that isn't allowed except that select() operates over multiple
> > scopes
> > > > and perhaps trying T through those scopes causes trouble.
> > > >
> > > > 4. Continuing with the use of Traversal as arguments, let's go back
> to:
> > > >
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > >
> > > > I originally had it in my mind to prefer this approach over a Map for
> > the
> > > > search as it provides a great deal of flexibility and fits the common
> > > > filtering patterns that users follow really well. I suppose the
> > question
> > > is
> > > > what to do in the situation of "create". Would we extract all the
> > strict
> > > > equality filter values from the initial has(), thus, in the example,
> > > > "state" but not "age", and create a vertex that has "state=NY,
> > > > active=true"? That is possible with how things are designed in
> Gremlin
> > > > today I think.
> > > >
> > > > This syntax does create some conflict with the subject of 3 and
> streams
> > > > because the traverser in the streams case is the incoming Map but for
> > > this
> > > > context it's meant as a filter of V(). Getting too whacky?
> > > >
> > > > 5. I like the idea of using a side-effect to capture whether the
> > element
> > > > was created or not. that makes sense. I think that can work.
> > > >
> > > > ==============================
> > > > = gist contents
> > > > ==============================
> > > >
> > > > # API
> > > >
> > > > ```java
> > > > upsertV(String label, Map matchOrCreateProperties)
> > > > upsertV(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertV(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertV(...).
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > >
> > > > upsertE(String label, Map matchOrCreateProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
> > > > upsertE(String label, Map matchOrCreateProperties, Map
> > > > additionalProperties)
> > > > upsertE(Traversal label, Traversal matchOrCreateProperties, Traversal
> > > > additionalProperties)
> > > > upsertE(...).
> > > >   from(Vertex incidentOut).to(Vertex incidentIn)
> > > >   with(WithOptions.sideEffectLabel, 'a')
> > > > ```
> > > >
> > > > # Examples
> > > >
> > > > ## upsert
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29])
> > > > ```
> > > >
> > > > ## upsert with id
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name: 'Marko'])
> > > > ```
> > > >
> > > > ## upsert with Meta Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, name:[first: Marko', last: 'R'])
> > > > ```
> > > >
> > > > ## upsert with Multi Properties
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1,lang: ['java', 'scala'])
> > > > ```
> > > >
> > > > ## upsert with Single Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: 'java')
> > > > ```
> > > >
> > > > ## upsert with List Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java', 'scala', 'java'])
> > > > ```
> > > >
> > > > ## upsert with Set Cardinality
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala'])
> > > > ```
> > > >
> > > > ## upsert with List Cardinality - and add value to list
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [(T.id): 1],
> > > >                     [(T.id): 1, lang: ['java','scala','java']).
> > > >   property(Cardinality.list, 'lang', 'java')
> > > > ```
> > > >
> > > > ## upsert with stream
> > > >
> > > > ```groovy
> > > > // doesn't work with today's Gremlin semantics
> > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
> > > >          [(id): 2, (label): 'person', name: 'josh']).
> > > >   upsertV(values(label), valueMap(id), valueMap())
> > > >
> > > > // the following would be more in line with what's possible with
> > existing
> > > > Gremlin semantics:
> > > > g.inject([id: 1, label: 'person', name: 'marko'],
> > > >          [id: 2, label: 'person', name: 'josh']).
> > > >   upsertV(select('label'), select('id'), select('id','label','name'))
> > > > ```
> > > >
> > > > ## upsert with reporting added or updated side effect
> > > >
> > > > ```groovy
> > > > g.upsertV('person', [name: 'marko'],
> > > >                     [name: 'marko', age: 29]).
> > > >     with(WithOptions.sideEffectLabel, 'a').
> > > >   project('vertex', 'added').
> > > >     by().
> > > >     by(cap('a'))
> > > > ```
> > > >
> > > > ## upsert Edge assuming a self-relation of "marko"
> > > >
> > > > ```groovy
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'self', [weight:0.5])
> > > > ```
> > > >
> > > > ## upsert Edge with incident vertex checks
> > > >
> > > > ```groovy
> > > > g.upsertE('knows', [weight:0.5]).
> > > >    from(V(1)).to(V(2))
> > > > g.V().has('person','name','marko').
> > > >   upsertE( 'knows', [weight:0.5]).
> > > >     to(V(2))
> > > > ```
> > > >
> > > > ## upsert using a Traversal for the match
> > > >
> > > > ```groovy
> > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state',
> > 'NY'),
> > > >                         [active: true])
> > > > ```
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
> [email protected]>
> > > > wrote:
> > > >
> > > > > So, I've been doing some additional thinking about the ways that
> this
> > > > could
> > > > > work based on the comments below and have put my comments inline.
> > > > >
> > > > > Dave
> > > > >
> > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
> > [email protected]
> > > >
> > > > > wrote:
> > > > >
> > > > > > I added some thoughts inline below:
> > > > > >
> > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
> > > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the detailed comments Stephen.  I have addressed
> them
> > > > inline
> > > > > > > below.
> > > > > > >
> > > > > > > I did read the proposal from earlier and I think that we are in
> > > close
> > > > > > > agreement with what we are trying to accomplish.  I also fully
> > > > support
> > > > > > > Josh's comment on providing a mechanism for submitting a map of
> > > > > > properties
> > > > > > > as manually unrolling this all right now leads to a lot of
> > > potential
> > > > > for
> > > > > > > error and a long messy traversal.
> > > > > > >
> > > > > > > I'm looking forward to this discussion on how to merge these
> two
> > > > > > proposals.
> > > > > > >
> > > > > > >
> > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > proposed?
> > > > > > >
> > > > > > > My first thought here is that multi-properties would be
> > represented
> > > > as
> > > > > > > lists in the map, e.g.
> > > > > > >
> > > > > > > {names: ['Dave', 'David']}
> > > > > > >
> > > > > > > and meta-properties would be represented as maps in the maps,
> > e.g.
> > > > > > >
> > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
> > > > > > >
> > > > > > > I can't say I've thought through all the implications of this
> > > though
> > > > so
> > > > > > it
> > > > > > > is an area we would need to explore.
> > > > > > >
> > > > > >
> > > > > > The first implication that comes to mind is that it makes the
> > > > assumption
> > > > > > the user wants multi/meta-properties as opposed to a single
> > > cardinality
> > > > > of
> > > > > > List in the first case and a Map as a property value in the
> second
> > > > case.
> > > > > I
> > > > > > suppose that graphs with a schema could resolve those assumptions
> > but
> > > > > > graphs that are schemaless would have a problem. The issue could
> be
> > > > > > resolved by specialized configuration of "g" or per merge() step
> > > using
> > > > a
> > > > > > with() modulator I suppose but that goes into a yet another level
> > of
> > > > > > implications to consider. I've often wondered if the start point
> > for
> > > > > > getting types/schema into TP3 without a full rewrite would be in
> > this
> > > > > form
> > > > > > where Gremlin would be given hints as to what to expect as to the
> > > types
> > > > > of
> > > > > > data it might encounter while traversing. Anyway, I'd be hesitant
> > to
> > > go
> > > > > > down paths that don't account for multi/metaproperties well.
> They
> > > are
> > > > > > first class citizens in TP3 (with those hoping for extension of
> at
> > > > least
> > > > > > multiproperties to edges) and while I find them a constant
> > annoyance
> > > > for
> > > > > so
> > > > > > many reasons, we're kinda stuck with them.
> > > > > >
> > > > >
> > > > > I agree we need to account for multi/meta properties as 1st class
> > > > > citizens.  This is my current thinking on the syntax for the
> > situations
> > > > we
> > > > > have laid out so far:
> > > > >
> > > > > *Merge*
> > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29})
> > > > >
> > > > >
> > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name':
> > 'Marko'})
> > > > >
> > > > >
> > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'name':
> > > > > {'first': Marko', 'last': 'R'})
> > > > >
> > > > >
> > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': ['java', 'scala'])
> > > > >
> > > > >
> > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, {T.id:
> 1,
> > > > > 'lang': 'java')
> > > > >
> > > > >
> > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala', 'java'])
> > > > >
> > > > >
> > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: 1,
> > > > 'lang':
> > > > > ['java', 'scala'])
> > > > >
> > > > > Since in a mergeV() scenario we are only ever adding whatever
> values
> > > are
> > > > > passed in there would be no need to specify the cardinality of the
> > > > property
> > > > > being added.  If they wanted to add a value to an existing property
> > > then
> > > > > the current property() method would still be available on the
> output
> > > > > traversal. e.g.
> > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala',
> > > > > 'java']).property(Cardinality.list, 'lang, 'java')
> > > > >
> > > > >
> > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name':
> > > 'marko'},
> > > > > {id: 2, label: 'person', 'name': 'josh'}]).
> > > > >   mergeV(values('label'), valueMap('id'), valueMap())
> > > > >
> > > > >
> > > > > *Merge with reporting added or updated side effect*
> g.mergeV('name',
> > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
> > > > >   with(WithOptions.sideEffectLabel, 'a').
> > > > >   project('vertex', 'added').
> > > > >     by().
> > > > >     by(cap('a'))
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 2. How would users set the T.id on creation? would that T.id
> just
> > > be
> > > > a
> > > > > > key
> > > > > > > in the first Map argument?
> > > > > > >
> > > > > > > Yes, I was thinking that T.id would be the key name, e.g.:
> > > > > > >
> > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
> > > > > > >
> > > > > >
> > > > > > ok - I can't say I see a problem with that atm.
> > > > > >
> > > > > >
> > > > > > > 3. I do like the general idea of a match on multiple properties
> > for
> > > > the
> > > > > > > first argument as a convenience but wonder about the
> specificity
> > of
> > > > > this
> > > > > > > API a bit as it focuses heavily on equality - I suppose that's
> > most
> > > > > cases
> > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > >
> > > > > > > In most cases I've seen use exact matches on the vertex or
> > edge.  I
> > > > > think
> > > > > > > it might be best to keep this straightforward as any complex
> edge
> > > > cases
> > > > > > > still can perform the same functionality using the coalesce()
> > > > pattern.
> > > > > > >
> > > > > >
> > > > > > I played around with the idea to use modulators to apply
> additional
> > > > > > constraints:
> > > > > >
> > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >     by(has('age', gt(29))
> > > > > >
> > > > > > I suppose that's neat but maybe not...we could easily get the
> same
> > > > thing
> > > > > > from:
> > > > > >
> > > > > > g.V().has('person','age',gt(29)).
> > > > > >   mergeV('person', {'state':'NY'}, {'active': true}).
> > > > > >
> > > > > > which is more readable though it does make it harder for
> providers
> > to
> > > > > > optimize upsert operations if they could make use of the has() as
> > > part
> > > > of
> > > > > > that. I think I like has() as part of the main query rather than
> > in a
> > > > > by()
> > > > > > - just thinking out loud.
> > > > > >
> > > > > >
> > > > > Another potential option here would be to allow the second
> parameter
> > of
> > > > > mergeV() accept a traversal.  Not sure how much complexity that
> would
> > > add
> > > > > though
> > > > >
> > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state',
> > > 'NY'}.,
> > > > > {'active': true}).
> > > > >
> > > > >
> > > > > >
> > > > > > > 4. I think your suggestion points to one of the troubles
> Gremlin
> > > has
> > > > > > which
> > > > > > > we see with "algorithms" - extending the language with new
> steps
> > > that
> > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > shortestPath() step) pollutes the core language a bit, hence my
> > > > > > > generalization of "merging" in my link above which fits into
> the
> > > core
> > > > > > > Gremlin language style. There is a bigger picture where we are
> > > > missing
> > > > > > > something in Gremlin that lets us extend the language in ways
> > that
> > > > let
> > > > > us
> > > > > > > easily introduce new steps that aren't for general purpose.
> This
> > > > issue
> > > > > is
> > > > > > > discussed in terms of "algorithms" here:
> > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> think
> > > we
> > > > > > could
> > > > > > > see how there might be some "mutation" extension steps that
> would
> > > > cover
> > > > > > > your suggested API, plus batch operations, etc. We need a way
> to
> > > add
> > > > > > > "sugar" without it interfering with the consistency of the
> core.
> > > > > > Obviously
> > > > > > > this is a bigger issue but perhaps important to solve to
> > implement
> > > > > steps
> > > > > > in
> > > > > > > the fashion you describe.
> > > > > > >
> > > > > > > I agree that the "algorithm" steps do seem a bit odd in the
> core
> > > > > language
> > > > > > > but I think the need is there.  I'd be interested in furthering
> > > this
> > > > > > > discussion but I think these "pattern" steps may or may not be
> > the
> > > > same
> > > > > > as
> > > > > > > the algorithm steps.  I'll have to think on that.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> specify
> > > what
> > > > > > > element type the first Map argument should be applied to? what
> > > about
> > > > > > > mergeVP (i.e. vertex property as it too is an element) ? That's
> > > > tricky
> > > > > > but
> > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > "complex
> > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > g.V().merge()
> > > > > > > and the label and first Map would fold to VertexStartStep (i.e.
> > > V())
> > > > > for
> > > > > > > the lookup and then a MergeStep would follow - thus a "complex
> > > > > modulator"
> > > > > > > as it does more than just change the behavior of the previous
> > step
> > > -
> > > > it
> > > > > > > also adds its own. I suppose it could also add has() steps
> > followed
> > > > by
> > > > > > the
> > > > > > > MergeStep and then the has() operations would fold in normally
> as
> > > > they
> > > > > do
> > > > > > > today. In this way, we can simplify to just one single
> > > > > > > merge(String,Map,Map). ??
> > > > > > >
> > > > > > > I agree that we should also think about how to include
> properties
> > > in
> > > > > this
> > > > > > > merge construct.  The reason I was thinking about mergeV() and
> > > > mergeE()
> > > > > > is
> > > > > > > that it follows the same pattern as the already well understood
> > > > > > > addV()/addE() steps.  I am a bit wary of trying to generalize
> > this
> > > > down
> > > > > > to
> > > > > > > a single merge() step as these sorts of complex overloads make
> it
> > > > hard
> > > > > to
> > > > > > > figure out which Gremlin step you should use for a particular
> > > pattern
> > > > > > (e.g.
> > > > > > > upsert vertex or upsert edge).  One thing that I think
> customers
> > > find
> > > > > > > useful about the addV/addE step is that they are very
> > discoverable,
> > > > the
> > > > > > > name tells me what functionality to expect from that step.
> > > > > > >
> > > > > >
> > > > > > I'm starting to agree with the idea that we have to do something
> > like
> > > > > > mergeV()/E() as it wouldn't quite work as a start step otherwise.
> > We
> > > > > > couldn't do:
> > > > > >
> > > > > > g.merge()
> > > > > >
> > > > > > as we wouldnt know what sort of Element it applied to. If so, I
> > > wonder
> > > > if
> > > > > > it would be better to preserve "merge" for my more general
> use-case
> > > and
> > > > > > prefer upsertV()/E()?
> > > > > >
> > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is an
> > > > > uncommon
> > > > > > case (compared to V/E()) and we don't really have addVP() as an
> > > > analogous
> > > > > > step. Perhaps existing coalesce() patterns and/or my more general
> > > > purpose
> > > > > > merge() step would satisfy those situations??
> > > > > >
> > > > >
> > > > > I think that a mergeVP() type step may require a different pattern
> > > since
> > > > it
> > > > > would really need to accept a map of values otherwise it would
> > > > essentially
> > > > > perform the same function as property()
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > 6. One thing neither my approach nor yours seems to do is tell
> > the
> > > > user
> > > > > > if
> > > > > > > they created something or updated something - that's another
> > thing
> > > > I've
> > > > > > > seen users want to have in get-or-create. Here again we go
> deeper
> > > > into
> > > > > a
> > > > > > > less general step specification as alluded to in 4, but a
> merge()
> > > > step
> > > > > as
> > > > > > > proposed in 5, might return [Element,boolean] so as to provide
> an
> > > > > > indicator
> > > > > > > of creation?
> > > > > > >
> > > > > > > Hmm, I'll have to think about that.  How would returning
> multiple
> > > > > values
> > > > > > > work if we want to chain these together. e.g. Add a vertex
> > between
> > > > two
> > > > > > > edges but make sure the vertices exist?
> > > > > > >
> > > > > >
> > > > >
> > > > > Added a thought on how to accomplish this above that I'd like to
> get
> > > > > thoughts on.
> > > > >
> > > > >
> > > > > > >
> > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > haven't
> > > > > > gotten
> > > > > > > this far yet, but a shortcoming to doing merge(String,Map,Map)
> is
> > > > that
> > > > > it
> > > > > > > leaves open no opportunity to stream a List of Maps to a
> merge()
> > > for
> > > > a
> > > > > > form
> > > > > > > of batch loading which is mighty common and one of the
> variations
> > > of
> > > > > the
> > > > > > > coalesce() pattern that I alluded to at the start of all this.
> I
> > > > think
> > > > > > that
> > > > > > > we would want to be sure that we left open the option to do
> that
> > > > > somehow.
> > > > > > > 8. If we had a general purpose merge() step I wonder if it
> makes
> > > > > > developing
> > > > > > > the API as you suggested easier to do?
> > > > > > >
> > > > > > > Hmm, let me think about that one.
> > > > > > >
> > > > > > >
> > > > > > I will add an item 8 to think about which I didn't mention
> before:
> > > > > >
> > > > > > 8. The signature you suggested for mergeE() is:
> > > > > >
> > > > > > mergeE(String, Map, Map)
> > > > > >      String - The edge label
> > > > > >      Map (first) - The properties to match existing edge on
> > > > > >      Map (second) - Any additional properties to set if a new
> edge
> > is
> > > > > > created (optional)
> > > > > >
> > > > > > but does that exactly answer the need? Typically the idea is to
> > > detect
> > > > an
> > > > > > edge between two vertices not globally in the graph by way of
> some
> > > > > > properties. This signature doesn't seem to allow for that as it
> > > doesn't
> > > > > > allow specification of the vertices to test against. Perhaps the
> > > answer
> > > > > is
> > > > > > to use from() and to() modulators?
> > > > > >
> > > > > > g.mergeE('knows', {'weight':0.5}).
> > > > > >   from(V(1)).to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'knows', {'weight':0.5}).
> > > > > >     to(V(2))
> > > > > > g.V().has('person','name','marko').
> > > > > >   mergeE( 'self', {'weight':0.5})
> > > > > >
> > > > > > That seems to work?
> > > > > >
> > > > >
> > > > > I think that is an elegant solution to that problem.  I like it and
> > it
> > > > > keeps in line with the way that addE() works.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
> > > > [email protected]
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I like our coalesce() pattern but it is verbose and over time
> > it
> > > > has
> > > > > > gone
> > > > > > > > from a simple pattern to one with numerous variations for all
> > > > manner
> > > > > of
> > > > > > > > different sorts of merge-like operations. As such, I do think
> > we
> > > > > should
> > > > > > > > introduce something to cover this pattern.
> > > > > > > >
> > > > > > > > I like that you used the word "merge" in your description of
> > this
> > > > as
> > > > > it
> > > > > > > is
> > > > > > > > the word I've liked using. You might want to give a look at
> my
> > > > > proposed
> > > > > > > > merge() step from earlier in the year:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
> > > > > > > >
> > > > > > > > I'm just going to dump thoughts as they come regarding what
> you
> > > > > wrote:
> > > > > > > >
> > > > > > > > 1. How would multi/meta-properties fit into the API you've
> > > > proposed?
> > > > > > > > 2. How would users set the T.id on creation? would that T.id
> > just
> > > > be
> > > > > a
> > > > > > > key
> > > > > > > > in the first Map argument?
> > > > > > > > 3. I do like the general idea of a match on multiple
> properties
> > > for
> > > > > the
> > > > > > > > first argument as a convenience but wonder about the
> > specificity
> > > of
> > > > > > this
> > > > > > > > API a bit as it focuses heavily on equality - I suppose
> that's
> > > most
> > > > > > cases
> > > > > > > > for get-or-create, so perhaps that's ok.
> > > > > > > > 4. I think your suggestion points to one of the troubles
> > Gremlin
> > > > has
> > > > > > > which
> > > > > > > > we see with "algorithms" - extending the language with new
> > steps
> > > > that
> > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with
> > > > > > > > shortestPath() step) pollutes the core language a bit, hence
> my
> > > > > > > > generalization of "merging" in my link above which fits into
> > the
> > > > core
> > > > > > > > Gremlin language style. There is a bigger picture where we
> are
> > > > > missing
> > > > > > > > something in Gremlin that lets us extend the language in ways
> > > that
> > > > > let
> > > > > > us
> > > > > > > > easily introduce new steps that aren't for general purpose.
> > This
> > > > > issue
> > > > > > is
> > > > > > > > discussed in terms of "algorithms" here:
> > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I
> > think
> > > > we
> > > > > > > could
> > > > > > > > see how there might be some "mutation" extension steps that
> > would
> > > > > cover
> > > > > > > > your suggested API, plus batch operations, etc. We need a way
> > to
> > > > add
> > > > > > > > "sugar" without it interfering with the consistency of the
> > core.
> > > > > > > Obviously
> > > > > > > > this is a bigger issue but perhaps important to solve to
> > > implement
> > > > > > steps
> > > > > > > in
> > > > > > > > the fashion you describe.
> > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
> > specify
> > > > what
> > > > > > > > element type the first Map argument should be applied to?
> what
> > > > about
> > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
> That's
> > > > > tricky
> > > > > > > but
> > > > > > > > I don't think we should miss that. Perhaps merge() could be a
> > > > > "complex
> > > > > > > > modulator"?? that's a new concept of course, but you would do
> > > > > > > g.V().merge()
> > > > > > > > and the label and first Map would fold to VertexStartStep
> (i.e.
> > > > V())
> > > > > > for
> > > > > > > > the lookup and then a MergeStep would follow - thus a
> "complex
> > > > > > modulator"
> > > > > > > > as it does more than just change the behavior of the previous
> > > step
> > > > -
> > > > > it
> > > > > > > > also adds its own. I suppose it could also add has() steps
> > > followed
> > > > > by
> > > > > > > the
> > > > > > > > MergeStep and then the has() operations would fold in
> normally
> > as
> > > > > they
> > > > > > do
> > > > > > > > today. In this way, we can simplify to just one single
> > > > > > > > merge(String,Map,Map). ??
> > > > > > > > 6. One thing neither my approach nor yours seems to do is
> tell
> > > the
> > > > > user
> > > > > > > if
> > > > > > > > they created something or updated something - that's another
> > > thing
> > > > > I've
> > > > > > > > seen users want to have in get-or-create. Here again we go
> > deeper
> > > > > into
> > > > > > a
> > > > > > > > less general step specification as alluded to in 4, but a
> > merge()
> > > > > step
> > > > > > as
> > > > > > > > proposed in 5, might return [Element,boolean] so as to
> provide
> > an
> > > > > > > indicator
> > > > > > > > of creation?
> > > > > > > > 7. You were just introducing your ideas here, so perhaps you
> > > > haven't
> > > > > > > gotten
> > > > > > > > this far yet, but a shortcoming to doing
> merge(String,Map,Map)
> > is
> > > > > that
> > > > > > it
> > > > > > > > leaves open no opportunity to stream a List of Maps to a
> > merge()
> > > > for
> > > > > a
> > > > > > > form
> > > > > > > > of batch loading which is mighty common and one of the
> > variations
> > > > of
> > > > > > the
> > > > > > > > coalesce() pattern that I alluded to at the start of all
> this.
> > I
> > > > > think
> > > > > > > that
> > > > > > > > we would want to be sure that we left open the option to do
> > that
> > > > > > somehow.
> > > > > > > > 8. If we had a general purpose merge() step I wonder if it
> > makes
> > > > > > > developing
> > > > > > > > the API as you suggested easier to do?
> > > > > > > >
> > > > > > > > I think I'd like to solve the problems you describe in your
> > post
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > the ones in mine. There is some relation there, but gaps as
> > well.
> > > > > With
> > > > > > > more
> > > > > > > > discussion here we can figure something out.
> > > > > > > >
> > > > > > > > Thanks for starting this talk - good one!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
> > > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I've had a few on and off discussions with a few people
> here,
> > > so
> > > > I
> > > > > > > wanted
> > > > > > > > > to send this out to everyone for feedback.
> > > > > > > > >
> > > > > > > > > What are people's thoughts on creating a new set of steps
> > that
> > > > > codify
> > > > > > > > > common Gremlin best practices?
> > > > > > > > >
> > > > > > > > > I think there are several common Gremlin patterns where
> users
> > > > would
> > > > > > > > benefit
> > > > > > > > > from the additional guidance that these codified steps
> > > represent.
> > > > > > The
> > > > > > > > > first one I would recommend though is codifying the element
> > > > > existence
> > > > > > > > > pattern into a single Gremlin step, something like:
> > > > > > > > >
> > > > > > > > > mergeV(String, Map, Map)
> > > > > > > > >      String - The vertex label
> > > > > > > > >      Map (first) - The properties to match existing
> vertices
> > on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > > vertex
> > > > > > is
> > > > > > > > > created (optional)
> > > > > > > > > mergeE(String, Map, Map)
> > > > > > > > >      String - The edge label
> > > > > > > > >      Map (first) - The properties to match existing edge on
> > > > > > > > >      Map (second) - Any additional properties to set if a
> new
> > > > edge
> > > > > is
> > > > > > > > > created (optional)
> > > > > > > > >
> > > > > > > > > In each of these cases these steps would perform the same
> > > upsert
> > > > > > > > > functionality as the element existence pattern.
> > > > > > > > >
> > > > > > > > > Example:
> > > > > > > > >
> > > > > > > > > g.V().has('person','name','stephen').
> > > > > > > > >            fold().
> > > > > > > > >            coalesce(unfold(),
> > > > > > > > >                     addV('person').
> > > > > > > > >                       property('name','stephen').
> > > > > > > > >                       property('age',34))
> > > > > > > > >
> > > > > > > > > would become:
> > > > > > > > >
> > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34})
> > > > > > > > >
> > > > > > > > > I think that this change would be a good addition to the
> > > language
> > > > > for
> > > > > > > > > several reasons:
> > > > > > > > >
> > > > > > > > > * This codifies the best practice for a specific
> > action/recipe,
> > > > > which
> > > > > > > > > reduces the chance that someone uses the pattern
> incorrectly
> > > > > > > > > * Most complex Gremlin traversals are verbose.  Reducing
> the
> > > > amount
> > > > > > of
> > > > > > > > code
> > > > > > > > > that needs to be written and maintained allows for a better
> > > > > developer
> > > > > > > > > experience.
> > > > > > > > > * It will lower the bar of entry for a developer by making
> > > these
> > > > > > > actions
> > > > > > > > > more discoverable.  The more we can help bring these
> patterns
> > > to
> > > > > the
> > > > > > > > > forefront of the language via these pattern/meta steps the
> > more
> > > > we
> > > > > > > guide
> > > > > > > > > users towards writing better Gremlin faster
> > > > > > > > > * This allows DB vendors to optimize for this pattern
> > > > > > > > >
> > > > > > > > > I know that this would likely be the first step in Gremlin
> > that
> > > > > > > codifies
> > > > > > > > a
> > > > > > > > > pattern, so I'd like to get other's thoughts on this?
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Creating pattern steps to codify best practices

Reply via email to