Re: [DISCUSS] Creating pattern steps to codify best practices

Stephen Mallette Mon, 21 Dec 2020 08:22:26 -0800

I gave a bit more thought to this one today. I think the stream case is
still stuck:


https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31#upsert-with-stream

Gremlin can only produce a Map<String,E2> which precludes T as a key. If
you look at the select()/project() signatures none of them nicely shift to
specification of T as part of the existing String arguments. Obviously we
could switch the String types in select() to wide open Object arguments but
I'm not sure that's a great thing as we already have enough type safety
sorts of issues with Gremlin in Java. I suppose it is less of an issue in
other languages like Python. Anyway, would still like to get this "stream"
part of upsert right....
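
To make the type problem concrete, here is a rough sketch in plain Python (not gremlinpython). `T`, `select_keys` and `promote_tokens` are hypothetical stand-ins of mine, not TinkerPop API. It mimics a String-only select() and shows one possible workaround: carry 'id' and 'label' as plain strings in the stream and promote them to tokens afterwards.

```python
from enum import Enum


class T(Enum):
    """Hypothetical stand-in for Gremlin's T tokens (T.id, T.label)."""
    id = "id"
    label = "label"


def select_keys(row, *keys):
    """Mimic a String-only select(): every key must be a str."""
    for key in keys:
        if not isinstance(key, str):
            raise TypeError(f"select() accepts only string keys, got {key!r}")
    return {key: row[key] for key in keys}


def promote_tokens(row):
    """Assumed workaround: turn reserved string keys into T tokens."""
    reserved = {"id": T.id, "label": T.label}
    return {reserved.get(key, key): value for key, value in row.items()}


row = {"id": 1, "label": "person", "name": "marko"}
selected = select_keys(row, "id", "label", "name")  # string keys only
promoted = promote_tokens(selected)                 # now keyed by T where reserved
```

Calling select_keys(row, T.id) raises TypeError here, which is the analogue of the signature mismatch described above.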

On Tue, Dec 15, 2020 at 8:13 AM Stephen Mallette <[email protected]>
wrote:

> We may yet need to keep upsertV/E() as a new step because without that
> addV() does get a bit odd as we would then lose the nice implicit match
> behavior of:
>
> // implicitly match on name/age without having to specify by()
> g.upsertV('person', [name: 'marko', age: 29])
>
> If we were to try to trigger that functionality with Map arguments but not
> the current method we'd likely fall deeper into kelvin's area of concerns.
> I've updated the gist again with the by() options and tweaked some other
> things:
>
> https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
>
>
>
> On Mon, Dec 14, 2020 at 3:20 PM Kelvin Lawrence <[email protected]>
> wrote:
>
>> In general I like the idea of having a simpler way to express an upsert
>> without needing to know the coalesce pattern.
>> I am a little worried that if the addV and addE steps are overloaded
>> to perform the upsert task that suddenly, steps that have always worked one
>> way, can now sometimes, do something different - not create but just find
>> an existing element. My concern is mainly around readability of queries and
>> a user knowing that a step can do certain extra things based on the
>> parameterization. This kind of goes a little against the Linux style
>> philosophy of each command doing one thing and doing it well.  Again, I'm
>> mostly raising this not because I am against the idea but wanting to make
>> sure it is very clear (maybe due to the presence of a by() modulator) that
>> a given addV/addE step is in "upsert mode". We have a few other cases where
>> steps work differently based on subtle parameterizations and people get
>> confused. Consider the case of
>> "where(is(eq('A')))"  versus  "where(eq('A'))".
>> I also want to think a bit about how/if these can be nested. Today I see
>> a lot of nested coalesce steps in queries people write trying to do large
>> (complex) multi-part upserts.
>> Cheers,Kelvin
>> Kelvin R. Lawrence
>>
>>     On Friday, December 11, 2020, 12:36:45 PM CST, Stephen Mallette <
>> [email protected]> wrote:
>>
>>  +1 to no new step. I haven't yet thought of a reason why this shouldn't
>> just be a variation on addV/E().
>>
>> On Fri, Dec 11, 2020 at 1:03 PM David Bechberger <[email protected]>
>> wrote:
>>
>> > I agree with Josh's description of what the upsertV() functionality is
>> > intended to be.
>> >
>> > I also fully support the simplification provided by the by() modulation
>> > that Stephen suggested, removing the second map.  I think that provides
>> a
>> > much cleaner and easier to comprehend syntax.
>> >
>> > With that agreement, I think this does beg the question of if this
>> should
>> > be a new step (upsertV()) or just an additional signature on the addV()
>> > step?
>> >
>> > Stephen alluded to this on the dev list and after thinking about this a
>> bit
>> > I think that I am in favor of not adding a step and just adding a new
>> > signature to the existing step (if possible).  Thoughts?
>> >
>> >
>> > Dave
>> >
>> > On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <[email protected]>
>> > wrote:
>> >
>> > > Josh, thanks for your thoughts - some responses inline:
>> > >
>> > > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <[email protected]
>> >
>> > > wrote:
>> > >
>> > > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent
>> > > getOrCreate
>> > > > call which always returns a vertex with the label/property values
>> > > specified
>> > > > within the step. It's sort of a declarative pattern: "return this
>> > vertex
>> > > to
>> > > > me, find it if you can, create it if you must."
>> > > >
>> > >
>> > > I like this description - I've added it to the gist, though it's a
>> bit at
>> > > odds with Dave's previous post, so we'll consider it a temporary
>> addition
>> > > until he responds.
>> > >
>> > >
>> > > > On that account, I do like the simplification in 1. Repetition
>> > shouldn't
>> > > be
>> > > > necessary. In an ideal world, the engine should know the primary
>> > > > identifiers (name or id) and find/create the vertex based on them.
>> Any
>> > > > other included values will be "trued up" as well. But this may be a
>> > > bridge
>> > > > too far for TinkerPop since knowing identifiers may require a
>> specified
>> > > > schema. I'd prefer to omit the third input, but it might be
>> necessary
>> > to
>> > > > keep it so that the second input can be for the matching use case.
>> > > >
>> > >
>> > > In my most recent post on gremlin-users I think I came up with a nice
>> way
>> > > to get rid of the second Map. One Map that forms the full list of
>> > > properties for upserting is easier than partitioning two Maps that
>> > > essentially merge together. I imagine it's unlikely that application
>> code
>> > > will have that separation naturally so users will have the added step
>> of
>> > > trying to separate their data into searchable vs "just data". Getting
>> us
>> > to
>> > > one Map argument will simplify APIs for us and reduce complexity to
>> > users.
>> > > Here is what I'd proposed for those not following over there:
>> > >
>> > > // match on name and age (or perhaps whatever the underlying graph
>> system
>> > > thinks is best?)
>> > > g.upsertV('person', [name:'marko',age:29])
>> > >
>> > > // match on name only
>> > > g.upsertV('person', [name:'marko',age:29]).by('name')
>> > >
>> > > // explicitly match on name and age
>> > > g.upsertV('person', [name:'marko',age:29]).
>> > >  by('name').by('age')
>> > >
>> > > // match on id only
>> > > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id)
>> > >
>> > > // match on whatever the by(Traversal) predicate defines
>> > > g.upsertV('person', [name:'marko',age:29]).
>> > >  by(has('name', 'marko'))
>> > >
>> > > // match on id, then update age
>> > > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id).
>> > >  property('age',29)
>> > >
>> > > With this model, we get one Map argument that represents the complete
>> > > property set to be added/updated to the graph and the user can hint on
>> > what
>> > > key they wish to match on using by() where that sort of step
>> modulation
>> > > should be a well understood and familiar concept in Gremlin at this
>> > point.
>> > >
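A minimal sketch of the one-Map-plus-by() idea, written in plain Python over an in-memory list rather than against any real graph. The names `upsert_v` and `match_keys` are hypothetical, and the sketch assumes a found vertex is updated with the full Map, which is still an open question in this thread.

```python
def upsert_v(graph, label, properties, match_keys=None):
    """Get-or-create over a list of dict vertices.

    match_keys plays the role of the by() modulators; when omitted,
    every key of the property Map is used for matching (an assumption).
    Returns (vertex, created).
    """
    keys = list(match_keys) if match_keys else list(properties)
    for vertex in graph:
        same_label = vertex["label"] == label
        if same_label and all(
            key in vertex and vertex[key] == properties[key] for key in keys
        ):
            vertex.update(properties)  # assumed "update on match" behavior
            return vertex, False
    vertex = {"label": label, **properties}
    graph.append(vertex)
    return vertex, True


graph = []
# match on name only: creates the vertex the first time
first, created_first = upsert_v(graph, "person", {"name": "marko", "age": 29}, ["name"])
# same name, new age: finds the vertex and updates age
second, created_second = upsert_v(graph, "person", {"name": "marko", "age": 30}, ["name"])
# implicit match on name AND age: age 31 is not present, so a new vertex appears
third, created_third = upsert_v(graph, "person", {"name": "marko", "age": 31})
```

The third call shows why the choice of match keys matters: matching implicitly on every key can create a duplicate where matching on name alone would have updated.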
>> > > So that means I think 2 should always match or update the additional
>> > > > values. Again, we're specifying the expected result and letting the
>> > > engine
>> > > > figure out best how to return that results and appropriately
>> maintain
>> > > > state.
>> > > >
>> > >
>> > > I again like this description, but we'll see what Dave's thoughts are
>> > since
>> > > he's a bit behind on the threads at this point I think.
>> > >
>> > >
>> > > > I'm also presuming that anything not included as inputs to the
>> > upsertV()
>> > > > step are then to be handled by following steps. I'm hoping that is a
>> > > > sufficient approach for addressing the multi/meta property use cases
>> > > > brought up in 3.
>> > > >
>> > >
>> > > yeah................it needs more thought. I spent more time thinking
>> on
>> > > this issue yesterday than I have for all the previous posts combined
>> and
>> > I
>> > > think it yielded something good in that revised syntax. It's going to
>> > take
>> > > more of that kind of elbow grease to dig into these lesser use cases
>> to
>> > > make sure we aren't coding ourselves into corners.
>> > >
>> > >
>> > > > I do like the idea of using modulators (with(), by()) for more
>> > > > sophisticated usage and advanced use cases. Also, the streaming
>> > examples
>> > > > are quite elegant allowing for a helpful separation of data and
>> logic.
>> > > >
>> > >
>> > > cool - hope you like the revised syntax I posted then. :)
>> > >
>> > >
>> > > > That's my humble take. This is a very welcome addition to the
>> language
>> > > and
>> > > > I appreciate the thoughtful & collaborative approach to the design
>> > > > considerations.
>> > > >
>> > >
>> > > Thanks again and please keep the thoughts coming. Lots of other
>> > interesting
>> > > design discussions seem to be brewing.
>> > >
>> > >
>> > > >
>> > > > Josh
>> > > >
>> > > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <
>> [email protected]>
>> > > > wrote:
>> > > >
>> > > > > I expanded this discussion to gremlin-users for a wider
>> > > > audience
>> > > > > and the thread is starting to grow:
>> > > > >
>> > > > >
>> > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ
>> > > > >
>> > > > > I guess we'll need to summarize that discussion back here now....
>> > > > >
>> > > > > I did have some more thoughts to hang out there and figured that I
>> > > > wouldn't
>> > > > > convolute the discussion on gremlin-users with it so I will
>> continue
>> > > the
>> > > > > discussion here.
>> > > > >
>> > > > > 1. The very first couple of examples seem wrong (or at least not
>> best
>> > > > > demonstrating the usage):
>> > > > >
>> > > > > g.upsertV('person', [name: 'marko'],
>> > > > >                    [name: 'marko', age: 29])
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [(T.id): 1, name: 'Marko'])
>> > > > >
>> > > > > should instead be:
>> > > > >
>> > > > > g.upsertV('person', [name: 'marko'],
>> > > > >                    [age: 29])
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [name: 'Marko'])
>> > > > >
>> > > > > 2. I can't recall if we made this clear anywhere but in situations
>> > > where
>> > > > we
>> > > > > "get" rather than "create" do the additional properties act in an
>> > > update
>> > > > > fashion to the element that was found? I think I've been working
>> on
>> > the
>> > > > > assumption that it would, though perhaps that is not always
>> > desirable?
>> > > > >
>> > > > > 3. We really never settled up how to deal with
>> multi/meta-properties.
>> > > > That
>> > > > > story should be clear so that when we document upsert() we include
>> > the
>> > > > > approaches for the fallback patterns that don't meet the 90% of
>> use
>> > > cases
>> > > > > we are targeting and I sense that we're saying that
>> > > meta/multi-properties
>> > > > > don't fit in that bucket. So with that in mind, I don't think that
>> > the
>> > > > > following works for metaproperties:
>> > > > >
>> > > > > g.upsertV('person', [(T.id): 1],
>> > > > >                    [name:[acl: 'public']])
>> > > > >
>> > > > > as it doesn't let us set the value for the "name" just the pairs
>> for
>> > > the
>> > > > > meta-properties. I guess a user would have to fall back to:
>> > > > >
>> > > > > g.upsertV('person', [(T.id): 1]).
>> > > > >  property('name','marko','acl','public')
>> > > > >
>> > > > > // or use the additional properties syntax
>> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
>> > > > >  properties('name').property('acl','public')
>> > > > >
>> > > > > // or if there were multi-properties then maybe...
>> > > > > g.upsertV('person', [(T.id): 1],[name:'marko']).
>> > > > >  properties('name').hasValue('marko').property('acl','public')
>> > > > >
>> > > > > As for multi-properties, I don't think we should assume that a List
>> > > object
>> > > > > should be interpreted by Gremlin as a multi-property. Perhaps we
>> just
>> > > > rely
>> > > > > on the underlying graph to properly deal with that given a schema
>> or
>> > > the
>> > > > > user falls back to:
>> > > > >
>> > > > > // this ends up as however the graph deals with a List object
>> > > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java']])
>> > > > >
>> > > > > // this is explicit
>> > > > > g.upsertV('person', [(T.id): 1]).
>> > > > >  property(list, 'lang', 'java').
>> > > > >  property(list, 'lang', 'scala').
>> > > > >  property(list, 'lang', 'java')
>> > > > >
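A rough illustration of why a bare List is ambiguous, in plain Python. `set_property` is a hypothetical helper that models each property as a list of values; it is not how any particular graph stores data. The same input ends up in three different shapes depending on the cardinality the user meant.

```python
def set_property(vertex, cardinality, key, value):
    """Model a vertex property as a list of values under one key."""
    values = vertex.setdefault(key, [])
    if cardinality == "single":
        values[:] = [value]          # replace whatever was there
    elif cardinality == "list":
        values.append(value)         # duplicates allowed
    elif cardinality == "set":
        if value not in values:      # duplicates dropped
            values.append(value)
    else:
        raise ValueError(f"unknown cardinality: {cardinality}")
    return vertex


langs = ["java", "scala", "java"]

# one property whose single value happens to be a List
as_single = set_property({}, "single", "lang", langs)

# three (or two) separate property values under the same key
as_list = {}
as_set = {}
for lang in langs:
    set_property(as_list, "list", "lang", lang)
    set_property(as_set, "set", "lang", lang)
```

Nothing in the List itself says which of the three outcomes was intended, so either a schema or the explicit property(cardinality, ...) form has to decide.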
>> > > > > If that makes sense to everyone, I will update the gist.
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <
>> > [email protected]
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hello Stephen,
>> > > > > >
>> > > > > > Thanks for making that gist, it's much easier to follow along
>> with
>> > the
>> > > > > > proposed syntax there.  To your specific comments:
>> > > > > >
>> > > > > > #1 - My only worry with the term upsert is that it is not as widely
>> > > > > > used a term for this sort of pattern as "Merge" (e.g. SQL, Cypher).
>> > > > > > However I don't have a strong opinion on this, so I am fine with
>> > > > > > either.
>> > > > > > #2 - My only real objective here is to make sure that we make
>> the
>> > > > 80-90%
>> > > > > > case easy and straightforward.  I think that having the fallback
>> > > option
>> > > > > of
>> > > > > > using the current syntax for any complicated edge cases should
>> be
>> > > > > > considered here as well. I'd appreciate your thoughts here as
>> these
>> > > are
>> > > > > > good points you bring up that definitely fall into the 80-90%
>> use
>> > > case.
>> > > > > > #3 - Those points make sense to me, not sure I have anything
>> > further
>> > > to
>> > > > > add
>> > > > > > #4 - I don't think I would expect to have the values extracted
>> from
>> > > the
>> > > > > > traversal filters but I'd be interested in other opinions on
>> that.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Dave
>> > > > > >
>> > > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette <
>> > > [email protected]
>> > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > It's been a couple weeks since we moved this thread so I went
>> > back
>> > > > > > through
>> > > > > > > it and refreshed my mind with what's there. Dave, nice job
>> > putting
>> > > > all
>> > > > > > > those examples together. I've recollected them all together
>> and
>> > > have
>> > > > > just
>> > > > > > > been reviewing them in a more formatted style with proper
>> Groovy
>> > > > syntax
>> > > > > > > than what can be accomplished in this mailing list. Anyone who
>> > > cares
>> > > > to
>> > > > > > > review in that form can do so here (i've pasted its markdown
>> > > contents
>> > > > > at
>> > > > > > > the bottom of this post as well):
>> > > > > > >
>> > > > > > >
>> > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31
>> > > > > > >
>> > > > > > > In this light, I have the following comments and thoughts:
>> > > > > > >
>> > > > > > > 1. I feel more inclined toward saving "merge" for a more
>> > > generalized
>> > > > > step
>> > > > > > > of that name and have thus renamed the examples to use
>> "upsert".
>> > > > > > >
>> > > > > > > 2. I still don't quite feel comfortable with
>> > meta/multi-properties
>> > > > > > > examples. Perhaps you could explain further as it's possible
>> I'm
>> > > not
>> > > > > > > thinking things through properly. For meta-properties the
>> example
>> > > is:
>> > > > > > >
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name:[first: 'Marko', last: 'R']])
>> > > > > > >
>> > > > > > > So I see that "name" takes a Map here, but how does Gremlin
>> know
>> > if
>> > > > > that
>> > > > > > > means "meta-property" or "Map value" and if the former, then
>> how
>> > do
>> > > > you
>> > > > > > set
>> > > > > > > the value of "name"?  My question is similar to the
>> > multi-property
>> > > > > > examples
>> > > > > > > - using this one:
>> > > > > > >
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java', 'scala']])
>> > > > > > >
>> > > > > > > How does Gremlin know if the user means that "lang" should be
>> a
>> > > > > > > "multi-property" with Cardinality.list (or Cardinality.set for
>> > that
>> > > > > > matter)
>> > > > > > > or a value of Cardinality.single with a List/Array value? I
>> can
>> > > > perhaps
>> > > > > > > suggest some solutions here but wanted to make sure I wasn't
>> just
>> > > > > > > misunderstanding something.
>> > > > > > >
>> > > > > > > 3. The "upsert with stream" example is:
>> > > > > > >
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
>> > > > > > >
>> > > > > > > That would establish an API of upsertV(Traversal, Traversal,
>> > > > Traversal)
>> > > > > > > which would enable your final thought of your last post with:
>> > > > > > >
>> > > > > > > g.V().upsertV('person',
>> > > > > > >               __.has('person','age',gt(29)).has('state', 'NY'),
>> > > > > > >               [active: true])
>> > > > > > >
>> > > > > > > The open-ended nature of that is neat but comes with some
>> > > complexity
>> > > > > with
>> > > > > > > steps that traditionally take an "open" Traversal (and this
>> step
>> > > now
>> > > > > has
>> > > > > > up
>> > > > > > > to three of them) - perhaps, that's our own fault in some
>> ways.
>> > For
>> > > > the
>> > > > > > > "stream" concept, values()/valueMap() won't immediately work
>> that
>> > > way
>> > > > > as
>> > > > > > > they only apply to Element...given today's Gremlin semantics,
>> I
>> > > think
>> > > > > > we'd
>> > > > > > > re-write that as:
>> > > > > > >
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(select(label), select(id), select(id,label,'name'))
>> > > > > > >
>> > > > > > > Though, even that doesn't quite work because you can't
>> select(T)
>> > > > right
>> > > > > > now
>> > > > > > > which means you'd need to go with:
>> > > > > > >
>> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
>> > > > > > >          [id: 2, label: 'person', name: 'josh']).
>> > > > > > >  upsertV(select('label'), select('id'),
>> > > > > > >          select('id','label','name'))
>> > > > > > >
>> > > > > > > not terrible in a way and perhaps we could fix select(T) -
>> can't
>> > > > > remember
>> > > > > > > why that isn't allowed except that select() operates over
>> > multiple
>> > > > > scopes
>> > > > > > > and perhaps trying T through those scopes causes trouble.
>> > > > > > >
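A small sketch of how the three child traversals would be evaluated once per incoming Map, in plain Python. `select` and `upsert_stream` are hypothetical stand-ins, the graph is just a dict keyed by id, and updating on match is assumed.

```python
def select(row, *keys):
    """One key returns the bare value; several keys return a sub-map."""
    if len(keys) == 1:
        return row[keys[0]]
    return {key: row[key] for key in keys}


def upsert_stream(graph, rows):
    """graph maps vertex id -> property dict; returns the upserted vertices."""
    upserted = []
    for row in rows:
        label = select(row, "label")                 # first argument
        match_id = select(row, "id")                 # second argument
        props = select(row, "id", "label", "name")   # third argument
        vertex = graph.setdefault(match_id, {"label": label})
        vertex.update(props)
        upserted.append(vertex)
    return upserted


# vertex 1 already exists with a misspelled name; vertex 2 does not exist yet
graph = {1: {"id": 1, "label": "person", "name": "marco"}}
rows = [
    {"id": 1, "label": "person", "name": "marko"},
    {"id": 2, "label": "person", "name": "josh"},
]
result = upsert_stream(graph, rows)
```

Each row acts as the current traverser, so the same three selections yield a different label, match and property set per row.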
>> > > > > > > 4. Continuing with the use of Traversal as arguments, let's go
>> > back
>> > > > to:
>> > > > > > >
>> > > > > > > g.V().upsertV('person',
>> > > > > > >               __.has('person','age',gt(29)).has('state', 'NY'),
>> > > > > > >               [active: true])
>> > > > > > >
>> > > > > > > I originally had it in my mind to prefer this approach over a
>> Map
>> > > for
>> > > > > the
>> > > > > > > search as it provides a great deal of flexibility and fits the
>> > > common
>> > > > > > > filtering patterns that users follow really well. I suppose
>> the
>> > > > > question
>> > > > > > is
>> > > > > > > what to do in the situation of "create". Would we extract all
>> the
>> > > > > strict
>> > > > > > > equality filter values from the initial has(), thus, in the
>> > > example,
>> > > > > > > "state" but not "age", and create a vertex that has "state=NY,
>> > > > > > > active=true"? That is possible with how things are designed in
>> > > > Gremlin
>> > > > > > > today I think.
>> > > > > > >
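The extraction described above can be sketched in a few lines of plain Python. The (key, operator, value) tuples and `properties_for_create` are a hypothetical encoding of the has() filters, not Gremlin's internal representation.

```python
def properties_for_create(filters, additional):
    """Keep only strict-equality filters, then add the extra properties."""
    created = {key: value for key, op, value in filters if op == "eq"}
    created.update(additional)
    return created


# has('person','age',gt(29)).has('state','NY') encoded as tuples
filters = [("age", "gt", 29), ("state", "eq", "NY")]
new_vertex = properties_for_create(filters, {"active": True})
```

Only "state" survives because gt(29) does not name a single value that could be written to the new vertex.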
>> > > > > > > This syntax does create some conflict with the subject of 3
>> and
>> > > > streams
>> > > > > > > because the traverser in the streams case is the incoming Map
>> but
>> > > for
>> > > > > > this
>> > > > > > > context it's meant as a filter of V(). Getting too whacky?
>> > > > > > >
>> > > > > > > 5. I like the idea of using a side-effect to capture whether
>> the
>> > > > > element
>> > > > > > > was created or not. that makes sense. I think that can work.
>> > > > > > >
>> > > > > > > ==============================
>> > > > > > > = gist contents
>> > > > > > > ==============================
>> > > > > > >
>> > > > > > > # API
>> > > > > > >
>> > > > > > > ```java
>> > > > > > > upsertV(String label, Map matchOrCreateProperties)
>> > > > > > > upsertV(String label, Map matchOrCreateProperties, Map
>> > > > > > > additionalProperties)
>> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties)
>> > > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties,
>> > > > > > >         Traversal additionalProperties)
>> > > > > > > upsertV(...).
>> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
>> > > > > > >
>> > > > > > > upsertE(String label, Map matchOrCreateProperties)
>> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties)
>> > > > > > > upsertE(String label, Map matchOrCreateProperties, Map
>> > > > > > > additionalProperties)
>> > > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties,
>> > > > > > >         Traversal additionalProperties)
>> > > > > > > upsertE(...).
>> > > > > > >  from(Vertex incidentOut).to(Vertex incidentIn).
>> > > > > > >  with(WithOptions.sideEffectLabel, 'a')
>> > > > > > > ```
>> > > > > > >
>> > > > > > > # Examples
>> > > > > > >
>> > > > > > > ## upsert
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [name: 'marko'],
>> > > > > > >                    [name: 'marko', age: 29])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with id
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name: 'Marko'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Meta Properties
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, name:[first: 'Marko', last: 'R']])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Multi Properties
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java', 'scala']])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Single Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: 'java'])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with List Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java', 'scala', 'java']])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with Set Cardinality
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java','scala']])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with List Cardinality - and add value to list
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [(T.id): 1],
>> > > > > > >                    [(T.id): 1, lang: ['java','scala','java']]).
>> > > > > > >  property(Cardinality.list, 'lang', 'java')
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with stream
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > // doesn't work with today's Gremlin semantics
>> > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'],
>> > > > > > >          [(id): 2, (label): 'person', name: 'josh']).
>> > > > > > >  upsertV(values(label), valueMap(id), valueMap())
>> > > > > > >
>> > > > > > > // the following would be more in line with what's possible with
>> > > > > > > // existing Gremlin semantics:
>> > > > > > > g.inject([id: 1, label: 'person', name: 'marko'],
>> > > > > > >          [id: 2, label: 'person', name: 'josh']).
>> > > > > > >  upsertV(select('label'), select('id'),
>> > > > > > >          select('id','label','name'))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert with reporting added or updated side effect
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertV('person', [name: 'marko'],
>> > > > > > >                    [name: 'marko', age: 29]).
>> > > > > > >    with(WithOptions.sideEffectLabel, 'a').
>> > > > > > >  project('vertex', 'added').
>> > > > > > >    by().
>> > > > > > >    by(cap('a'))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert Edge assuming a self-relation of "marko"
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.V().has('person','name','marko').
>> > > > > > >  upsertE( 'self', [weight:0.5])
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert Edge with incident vertex checks
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.upsertE('knows', [weight:0.5]).
>> > > > > > >    from(V(1)).to(V(2))
>> > > > > > > g.V().has('person','name','marko').
>> > > > > > >  upsertE( 'knows', [weight:0.5]).
>> > > > > > >    to(V(2))
>> > > > > > > ```
>> > > > > > >
>> > > > > > > ## upsert using a Traversal for the match
>> > > > > > >
>> > > > > > > ```groovy
>> > > > > > > g.V().upsertV('person',
>> > > > > > >               __.has('person','age',gt(29)).has('state', 'NY'),
>> > > > > > >               [active: true])
>> > > > > > > ```
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger <
>> > > > [email protected]>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > So, I've been doing some additional thinking about the ways
>> > that
>> > > > this
>> > > > > > > could
>> > > > > > > > work based on the comments below and have put my comments
>> > inline.
>> > > > > > > >
>> > > > > > > > Dave
>> > > > > > > >
>> > > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette <
>> > > > > [email protected]
>> > > > > > >
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > I added some thoughts inline below:
>> > > > > > > > >
>> > > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger <
>> > > > > > [email protected]>
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Thanks for the detailed comments Stephen.  I have
>> addressed
>> > > > them
>> > > > > > > inline
>> > > > > > > > > > below.
>> > > > > > > > > >
>> > > > > > > > > > I did read the proposal from earlier and I think that we
>> > are
>> > > in
>> > > > > > close
>> > > > > > > > > > agreement with what we are trying to accomplish.  I also
>> > > fully
>> > > > > > > support
>> > > > > > > > > > Josh's comment on providing a mechanism for submitting a
>> > map
>> > > of
>> > > > > > > > > properties
>> > > > > > > > > > as manually unrolling this all right now leads to a lot
>> of
>> > > > > > potential
>> > > > > > > > for
>> > > > > > > > > > error and a long messy traversal.
>> > > > > > > > > >
>> > > > > > > > > > I'm looking forward to this discussion on how to merge
>> > these
>> > > > two
>> > > > > > > > > proposals.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > 1. How would multi/meta-properties fit into the API
>> you've
>> > > > > > proposed?
>> > > > > > > > > >
>> > > > > > > > > > My first thought here is that multi-properties would be
>> > > > > represented
>> > > > > > > as
>> > > > > > > > > > lists in the map, e.g.
>> > > > > > > > > >
>> > > > > > > > > > {names: ['Dave', 'David']}
>> > > > > > > > > >
>> > > > > > > > > > and meta-properties would be represented as maps in the
>> > maps,
>> > > > > e.g.
>> > > > > > > > > >
>> > > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}}
>> > > > > > > > > >
>> > > > > > > > > > I can't say I've thought through all the implications of
>> > this
>> > > > > > though
>> > > > > > > so
>> > > > > > > > > it
>> > > > > > > > > > is an area we would need to explore.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > The first implication that comes to mind is that it makes
>> the
>> > > > > > > assumption
>> > > > > > > > > the user wants multi/meta-properties as opposed to a
>> single
>> > > > > > cardinality
>> > > > > > > > of
>> > > > > > > > > List in the first case and a Map as a property value in
>> the
>> > > > second
>> > > > > > > case.
>> > > > > > > > I
>> > > > > > > > > suppose that graphs with a schema could resolve those
>> > > assumptions
>> > > > > but
>> > > > > > > > > graphs that are schemaless would have a problem. The issue
>> > > could
>> > > > be
>> > > > > > > > > resolved by specialized configuration of "g" or per
>> merge()
>> > > step
>> > > > > > using
>> > > > > > > a
>> > > > > > > > > with() modulator I suppose but that goes into a yet
>> another
>> > > level
>> > > > > of
>> > > > > > > > > implications to consider. I've often wondered if the start
>> > > point
>> > > > > for
>> > > > > > > > > getting types/schema into TP3 without a full rewrite
>> would be
>> > > in
>> > > > > this
>> > > > > > > > form
>> > > > > > > > > where Gremlin would be given hints as to what to expect
>> as to
>> > > the
>> > > > > > types
>> > > > > > > > of
>> > > > > > > > > data it might encounter while traversing. Anyway, I'd be
>> > > hesitant
>> > > > > to
>> > > > > > go
>> > > > > > > > > down paths that don't account for multi/metaproperties
>> well.
>> > > > They
>> > > > > > are
>> > > > > > > > > first class citizens in TP3 (with those hoping for
>> extension
>> > of
>> > > > at
>> > > > > > > least
>> > > > > > > > > multiproperties to edges) and while I find them a constant
>> > > > > annoyance
>> > > > > > > for
>> > > > > > > > so
>> > > > > > > > > many reasons, we're kinda stuck with them.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I agree we need to account for multi/meta properties as 1st
>> > class
>> > > > > > > > citizens.  This is my current thinking on the syntax for the
>> > > > > situations
>> > > > > > > we
>> > > > > > > > have laid out so far:
>> > > > > > > >
>> > > > > > > > *Merge*
>> > > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age':
>> > 29})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1,
>> 'name':
>> > > > > 'Marko'})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1},
>> > > > > > > > {T.id: 1, 'name': {'first': 'Marko', 'last': 'R'}})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Multi Properties* g.mergeV('name', {T.id: 1},
>> > > > > > > > {T.id: 1, 'lang': ['java', 'scala']})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1},
>> > > > > > > > {T.id: 1, 'lang': 'java'})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1},
>> > > > > > > > {T.id: 1, 'lang': ['java', 'scala', 'java']})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with Set Cardinality* g.mergeV('name', {T.id: 1},
>> > > > > > > > {T.id: 1, 'lang': ['java', 'scala']})
>> > > > > > > >
>> > > > > > > > Since in a mergeV() scenario we are only ever adding
>> whatever
>> > > > values
>> > > > > > are
>> > > > > > > > passed in there would be no need to specify the cardinality
>> of
>> > > the
>> > > > > > > property
>> > > > > > > > being added.  If they wanted to add a value to an existing
>> > > property
>> > > > > > then
>> > > > > > > > the current property() method would still be available on
>> the
>> > > > output
>> > > > > > > > traversal. e.g.
>> > > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java',
>> 'scala',
>> > > > > > > > 'java']}).property(Cardinality.list, 'lang', 'java')
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person',
>> 'name':
>> > > > > > 'marko'},
>> > > > > > > > {id: 2, label: 'person', 'name': 'josh'}]).
>> > > > > > > >  mergeV(values('label'), valueMap('id'), valueMap())
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > *Merge with reporting added or updated side effect*
>> > > > g.mergeV('name',
>> > > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}).
>> > > > > > > >  with(WithOptions.sideEffectLabel, 'a').
>> > > > > > > >  project('vertex', 'added').
>> > > > > > > >    by().
>> > > > > > > >    by(cap('a'))
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 2. How would users set the T.id on creation? would that
>> > T.id
>> > > > just
>> > > > > > be
>> > > > > > > a
>> > > > > > > > > key
>> > > > > > > > > > in the first Map argument?
>> > > > > > > > > >
>> > > > > > > > > > Yes, I was thinking that T.id would be the key name,
>> e.g.:
>> > > > > > > > > >
>> > > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'})
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > ok - I can't say I see a problem with that atm.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 3. I do like the general idea of a match on multiple
>> > > properties
>> > > > > for
>> > > > > > > the
>> > > > > > > > > > first argument as a convenience but wonder about the
>> > > > specificity
>> > > > > of
>> > > > > > > > this
>> > > > > > > > > > API a bit as it focuses heavily on equality - I suppose
>> > > that's
>> > > > > most
>> > > > > > > > cases
>> > > > > > > > > > for get-or-create, so perhaps that's ok.
>> > > > > > > > > >
>> > > > > > > > > > Most cases I've seen use exact matches on the vertex
>> or
>> > > > > edge.  I
>> > > > > > > > think
>> > > > > > > > > > it might be best to keep this straightforward as any
>> > complex
>> > > > edge
>> > > > > > > cases
>> > > > > > > > > > still can perform the same functionality using the
>> > coalesce()
>> > > > > > > pattern.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > I played around with the idea to use modulators to apply
>> > > > additional
>> > > > > > > > > constraints:
>> > > > > > > > >
>> > > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}).
>> > > > > > > > >    by(has('age', gt(29)))
>> > > > > > > > >
>> > > > > > > > > I suppose that's neat but maybe not...we could easily get
>> the
>> > > > same
>> > > > > > > thing
>> > > > > > > > > from:
>> > > > > > > > >
>> > > > > > > > > g.V().has('person','age',gt(29)).
>> > > > > > > > >  mergeV('person', {'state':'NY'}, {'active': true})
>> > > > > > > > >
>> > > > > > > > > which is more readable though it does make it harder for
>> > > > providers
>> > > > > to
>> > > > > > > > > optimize upsert operations if they could make use of the
>> > has()
>> > > as
>> > > > > > part
>> > > > > > > of
>> > > > > > > > > that. I think I like has() as part of the main query
>> rather
>> > > than
>> > > > > in a
>> > > > > > > > by()
>> > > > > > > > > - just thinking out loud.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > Another potential option here would be to allow the second
>> > > > parameter
>> > > > > of
>> > > > > > > > mergeV() to accept a traversal.  Not sure how much complexity
>> that
>> > > > would
>> > > > > > add
>> > > > > > > > though
>> > > > > > > >
>> > > > > > > > g.V().mergeV('person',
>> > __.has('person','age',gt(29)).has('state',
>> > > > > > 'NY'),
>> > > > > > > > {'active': true})
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 4. I think your suggestion points to one of the troubles
>> > > > Gremlin
>> > > > > > has
>> > > > > > > > > which
>> > > > > > > > > > we see with "algorithms" - extending the language with
>> new
>> > > > steps
>> > > > > > that
>> > > > > > > > > > provide a form of "sugar" (e.g. in algorithms we end up
>> > with
>> > > > > > > > > > shortestPath() step) pollutes the core language a bit,
>> > hence
>> > > my
>> > > > > > > > > > generalization of "merging" in my link above which fits
>> > into
>> > > > the
>> > > > > > core
>> > > > > > > > > > Gremlin language style. There is a bigger picture where
>> we
>> > > are
>> > > > > > > missing
>> > > > > > > > > > something in Gremlin that lets us extend the language in
>> > ways
>> > > > > that
>> > > > > > > let
>> > > > > > > > us
>> > > > > > > > > > easily introduce new steps that aren't for general
>> purpose.
>> > > > This
>> > > > > > > issue
>> > > > > > > > is
>> > > > > > > > > > discussed in terms of "algorithms" here:
>> > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
>> but I
>> > > > think
>> > > > > > we
>> > > > > > > > > could
>> > > > > > > > > > see how there might be some "mutation" extension steps
>> that
>> > > > would
>> > > > > > > cover
>> > > > > > > > > > your suggested API, plus batch operations, etc. We need
>> a
>> > way
>> > > > to
>> > > > > > add
>> > > > > > > > > > "sugar" without it interfering with the consistency of
>> the
>> > > > core.
>> > > > > > > > > Obviously
>> > > > > > > > > > this is a bigger issue but perhaps important to solve to
>> > > > > implement
>> > > > > > > > steps
>> > > > > > > > > in
>> > > > > > > > > > the fashion you describe.
>> > > > > > > > > >
>> > > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in
>> the
>> > > > core
>> > > > > > > > language
>> > > > > > > > > > but I think the need is there.  I'd be interested in
>> > > furthering
>> > > > > > this
>> > > > > > > > > > discussion but I think these "pattern" steps may or may
>> not
>> > > be
>> > > > > the
>> > > > > > > same
>> > > > > > > > > as
>> > > > > > > > > > the algorithm steps.  I'll have to think on that.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to
>> > > > specify
>> > > > > > what
>> > > > > > > > > > element type the first Map argument should be applied
>> to?
>> > > what
>> > > > > > about
>> > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ?
>> > > That's
>> > > > > > > tricky
>> > > > > > > > > but
>> > > > > > > > > > I don't think we should miss that. Perhaps merge() could
>> > be a
>> > > > > > > "complex
>> > > > > > > > > > modulator"?? that's a new concept of course, but you
>> would
>> > do
>> > > > > > > > > g.V().merge()
>> > > > > > > > > > and the label and first Map would fold to
>> VertexStartStep
>> > > (i.e.
>> > > > > > V())
>> > > > > > > > for
>> > > > > > > > > > the lookup and then a MergeStep would follow - thus a
>> > > "complex
>> > > > > > > > modulator"
>> > > > > > > > > > as it does more than just change the behavior of the
>> > previous
>> > > > > step
>> > > > > > -
>> > > > > > > it
>> > > > > > > > > > also adds its own. I suppose it could also add has()
>> steps
>> > > > > followed
>> > > > > > > by
>> > > > > > > > > the
>> > > > > > > > > > MergeStep and then the has() operations would fold in
>> > > normally
>> > > > as
>> > > > > > > they
>> > > > > > > > do
>> > > > > > > > > > today. In this way, we can simplify to just one single
>> > > > > > > > > > merge(String,Map,Map). ??
>> > > > > > > > > >
>> > > > > > > > > > I agree that we should also think about how to include
>> > > > properties
>> > > > > > in
>> > > > > > > > this
>> > > > > > > > > > merge construct.  The reason I was thinking about
>> mergeV()
>> > > and
>> > > > > > > mergeE()
>> > > > > > > > > is
>> > > > > > > > > > that it follows the same pattern as the already well
>> > > understood
>> > > > > > > > > > addV()/addE() steps.  I am a bit wary of trying to
>> > generalize
>> > > > > this
>> > > > > > > down
>> > > > > > > > > to
>> > > > > > > > > > a single merge() step as these sorts of complex
>> overloads
>> > > make
>> > > > it
>> > > > > > > hard
>> > > > > > > > to
>> > > > > > > > > > figure out which Gremlin step you should use for a
>> > particular
>> > > > > > pattern
>> > > > > > > > > (e.g.
>> > > > > > > > > > upsert vertex or upsert edge).  One thing that I think
>> > > > customers
>> > > > > > find
>> > > > > > > > > > useful about the addV/addE step is that they are very
>> > > > > discoverable,
>> > > > > > > the
>> > > > > > > > > > name tells me what functionality to expect from that
>> step.
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > I'm starting to agree with the idea that we have to do
>> > > something
>> > > > > like
>> > > > > > > > > mergeV()/E() as it wouldn't quite work as a start step
>> > > otherwise.
>> > > > > We
>> > > > > > > > > couldn't do:
>> > > > > > > > >
>> > > > > > > > > g.merge()
>> > > > > > > > >
>> > > > > > > > > as we wouldn't know what sort of Element it applied to. If
>> > so, I
>> > > > > > wonder
>> > > > > > > if
>> > > > > > > > > it would be better to preserve "merge" for my more general
>> > > > use-case
>> > > > > > and
>> > > > > > > > > prefer upsertV()/E()?
>> > > > > > > > >
>> > > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps
>> it
>> > is
>> > > an
>> > > > > > > > uncommon
>> > > > > > > > > case (compared to V/E()) and we don't really have addVP()
>> as
>> > an
>> > > > > > > analogous
>> > > > > > > > > step. Perhaps existing coalesce() patterns and/or my more
>> > > general
>> > > > > > > purpose
>> > > > > > > > > merge() step would satisfy those situations??
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I think that a mergeVP() type step may require a different
>> > > pattern
>> > > > > > since
>> > > > > > > it
>> > > > > > > > would really need to accept a map of values otherwise it
>> would
>> > > > > > > essentially
>> > > > > > > > perform the same function as property()
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > 6. One thing neither my approach nor yours seems to do
>> is
>> > > tell
>> > > > > the
>> > > > > > > user
>> > > > > > > > > if
>> > > > > > > > > > they created something or updated something - that's
>> > another
>> > > > > thing
>> > > > > > > I've
>> > > > > > > > > > seen users want to have in get-or-create. Here again we
>> go
>> > > > deeper
>> > > > > > > into
>> > > > > > > > a
>> > > > > > > > > > less general step specification as alluded to in 4, but
>> a
>> > > > merge()
>> > > > > > > step
>> > > > > > > > as
>> > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
>> > > provide
>> > > > an
>> > > > > > > > > indicator
>> > > > > > > > > > of creation?
>> > > > > > > > > >
>> > > > > > > > > > Hmm, I'll have to think about that.  How would returning
>> > > > multiple
>> > > > > > > > values
>> > > > > > > > > > work if we want to chain these together. e.g. Add an
>> edge
>> > > > > between
>> > > > > > > two
>> > > > > > > > > > vertices but make sure the vertices exist?
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > Added a thought on how to accomplish this above that I'd
>> like
>> > to
>> > > > get
>> > > > > > > > thoughts on.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > 7. You were just introducing your ideas here, so perhaps
>> > you
>> > > > > > haven't
>> > > > > > > > > gotten
>> > > > > > > > > > this far yet, but a shortcoming to doing
>> > > merge(String,Map,Map)
>> > > > is
>> > > > > > > that
>> > > > > > > > it
>> > > > > > > > > > leaves open no opportunity to stream a List of Maps to a
>> > > > merge()
>> > > > > > for
>> > > > > > > a
>> > > > > > > > > form
>> > > > > > > > > > of batch loading which is mighty common and one of the
>> > > > variations
>> > > > > > of
>> > > > > > > > the
>> > > > > > > > > > coalesce() pattern that I alluded to at the start of all
>> > > this.
>> > > > I
>> > > > > > > think
>> > > > > > > > > that
>> > > > > > > > > > we would want to be sure that we left open the option
>> to do
>> > > > that
>> > > > > > > > somehow.
>> > > > > > > > > > 8. If we had a general purpose merge() step I wonder if
>> it
>> > > > makes
>> > > > > > > > > developing
>> > > > > > > > > > the API as you suggested easier to do?
>> > > > > > > > > >
>> > > > > > > > > > Hmm, let me think about that one.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > I will add an item 8 to think about which I didn't mention
>> > > > before:
>> > > > > > > > >
>> > > > > > > > > 8. The signature you suggested for mergeE() is:
>> > > > > > > > >
>> > > > > > > > > mergeE(String, Map, Map)
>> > > > > > > > >      String - The edge label
>> > > > > > > > >      Map (first) - The properties to match existing edge
>> on
>> > > > > > > > >      Map (second) - Any additional properties to set if a
>> new
>> > > > edge
>> > > > > is
>> > > > > > > > > created (optional)
>> > > > > > > > >
>> > > > > > > > > but does that exactly answer the need? Typically the idea
>> is
>> > to
>> > > > > > detect
>> > > > > > > an
>> > > > > > > > > edge between two vertices not globally in the graph by
>> way of
>> > > > some
>> > > > > > > > > properties. This signature doesn't seem to allow for that
>> as
>> > it
>> > > > > > doesn't
>> > > > > > > > > allow specification of the vertices to test against.
>> Perhaps
>> > > the
>> > > > > > answer
>> > > > > > > > is
>> > > > > > > > > to use from() and to() modulators?
>> > > > > > > > >
>> > > > > > > > > g.mergeE('knows', {'weight':0.5}).
>> > > > > > > > >  from(V(1)).to(V(2))
>> > > > > > > > > g.V().has('person','name','marko').
>> > > > > > > > >  mergeE( 'knows', {'weight':0.5}).
>> > > > > > > > >    to(V(2))
>> > > > > > > > > g.V().has('person','name','marko').
>> > > > > > > > >  mergeE( 'self', {'weight':0.5})
>> > > > > > > > >
>> > > > > > > > > That seems to work?
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > I think that is an elegant solution to that problem.  I
>> like it
>> > > and
>> > > > > it
>> > > > > > > > keeps in line with the way that addE() works.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette <
>> > > > > > > [email protected]
>> > > > > > > > >
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > I like our coalesce() pattern but it is verbose and
>> over
>> > > time
>> > > > > it
>> > > > > > > has
>> > > > > > > > > gone
>> > > > > > > > > > > from a simple pattern to one with numerous variations
>> for
>> > > all
>> > > > > > > manner
>> > > > > > > > of
>> > > > > > > > > > > different sorts of merge-like operations. As such, I
>> do
>> > > think
>> > > > > we
>> > > > > > > > should
>> > > > > > > > > > > introduce something to cover this pattern.
>> > > > > > > > > > >
>> > > > > > > > > > > I like that you used the word "merge" in your
>> description
>> > > of
>> > > > > this
>> > > > > > > as
>> > > > > > > > it
>> > > > > > > > > > is
>> > > > > > > > > > > the word I've liked using. You might want to give a
>> look
>> > at
>> > > > my
>> > > > > > > > proposed
>> > > > > > > > > > > merge() step from earlier in the year:
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E
>> > > > > > > > > > >
>> > > > > > > > > > > I'm just going to dump thoughts as they come regarding
>> > what
>> > > > you
>> > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > 1. How would multi/meta-properties fit into the API
>> > you've
>> > > > > > > proposed?
>> > > > > > > > > > > 2. How would users set the T.id on creation? would
>> that
>> > > T.id
>> > > > > just
>> > > > > > > be
>> > > > > > > > a
>> > > > > > > > > > key
>> > > > > > > > > > > in the first Map argument?
>> > > > > > > > > > > 3. I do like the general idea of a match on multiple
>> > > > properties
>> > > > > > for
>> > > > > > > > the
>> > > > > > > > > > > first argument as a convenience but wonder about the
>> > > > > specificity
>> > > > > > of
>> > > > > > > > > this
>> > > > > > > > > > > API a bit as it focuses heavily on equality - I
>> suppose
>> > > > that's
>> > > > > > most
>> > > > > > > > > cases
>> > > > > > > > > > > for get-or-create, so perhaps that's ok.
>> > > > > > > > > > > 4. I think your suggestion points to one of the
>> troubles
>> > > > > Gremlin
>> > > > > > > has
>> > > > > > > > > > which
>> > > > > > > > > > > we see with "algorithms" - extending the language with
>> > new
>> > > > > steps
>> > > > > > > that
>> > > > > > > > > > > provide a form of "sugar" (e.g. in algorithms we end
>> up
>> > > with
>> > > > > > > > > > > shortestPath() step) pollutes the core language a bit,
>> > > hence
>> > > > my
>> > > > > > > > > > > generalization of "merging" in my link above which
>> fits
>> > > into
>> > > > > the
>> > > > > > > core
>> > > > > > > > > > > Gremlin language style. There is a bigger picture
>> where
>> > we
>> > > > are
>> > > > > > > > missing
>> > > > > > > > > > > something in Gremlin that lets us extend the language
>> in
>> > > ways
>> > > > > > that
>> > > > > > > > let
>> > > > > > > > > us
>> > > > > > > > > > > easily introduce new steps that aren't for general
>> > purpose.
>> > > > > This
>> > > > > > > > issue
>> > > > > > > > > is
>> > > > > > > > > > > discussed in terms of "algorithms" here:
>> > > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991
>> > but I
>> > > > > think
>> > > > > > > we
>> > > > > > > > > > could
>> > > > > > > > > > > see how there might be some "mutation" extension steps
>> > that
>> > > > > would
>> > > > > > > > cover
>> > > > > > > > > > > your suggested API, plus batch operations, etc. We
>> need a
>> > > way
>> > > > > to
>> > > > > > > add
>> > > > > > > > > > > "sugar" without it interfering with the consistency of
>> > the
>> > > > > core.
>> > > > > > > > > > Obviously
>> > > > > > > > > > > this is a bigger issue but perhaps important to solve
>> to
>> > > > > > implement
>> > > > > > > > > steps
>> > > > > > > > > > in
>> > > > > > > > > > > the fashion you describe.
>> > > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is
>> to
>> > > > > specify
>> > > > > > > what
>> > > > > > > > > > > element type the first Map argument should be applied
>> to?
>> > > > what
>> > > > > > > about
>> > > > > > > > > > > mergeVP (i.e. vertex property as it too is an
>> element) ?
>> > > > That's
>> > > > > > > > tricky
>> > > > > > > > > > but
>> > > > > > > > > > > I don't think we should miss that. Perhaps merge()
>> could
>> > > be a
>> > > > > > > > "complex
>> > > > > > > > > > > modulator"?? that's a new concept of course, but you
>> > would
>> > > do
>> > > > > > > > > > g.V().merge()
>> > > > > > > > > > > and the label and first Map would fold to
>> VertexStartStep
>> > > > (i.e.
>> > > > > > > V())
>> > > > > > > > > for
>> > > > > > > > > > > the lookup and then a MergeStep would follow - thus a
>> > > > "complex
>> > > > > > > > > modulator"
>> > > > > > > > > > > as it does more than just change the behavior of the
>> > > previous
>> > > > > > step
>> > > > > > > -
>> > > > > > > > it
>> > > > > > > > > > > also adds its own. I suppose it could also add has()
>> > steps
>> > > > > > followed
>> > > > > > > > by
>> > > > > > > > > > the
>> > > > > > > > > > > MergeStep and then the has() operations would fold in
>> > > > normally
>> > > > > as
>> > > > > > > > they
>> > > > > > > > > do
>> > > > > > > > > > > today. In this way, we can simplify to just one single
>> > > > > > > > > > > merge(String,Map,Map). ??
>> > > > > > > > > > > 6. One thing neither my approach nor yours seems to
>> do is
>> > > > tell
>> > > > > > the
>> > > > > > > > user
>> > > > > > > > > > if
>> > > > > > > > > > > they created something or updated something - that's
>> > > another
>> > > > > > thing
>> > > > > > > > I've
>> > > > > > > > > > > seen users want to have in get-or-create. Here again
>> we
>> > go
>> > > > > deeper
>> > > > > > > > into
>> > > > > > > > > a
>> > > > > > > > > > > less general step specification as alluded to in 4,
>> but a
>> > > > > merge()
>> > > > > > > > step
>> > > > > > > > > as
>> > > > > > > > > > > proposed in 5, might return [Element,boolean] so as to
>> > > > provide
>> > > > > an
>> > > > > > > > > > indicator
>> > > > > > > > > > > of creation?
>> > > > > > > > > > > 7. You were just introducing your ideas here, so
>> perhaps
>> > > you
>> > > > > > > haven't
>> > > > > > > > > > gotten
>> > > > > > > > > > > this far yet, but a shortcoming to doing
>> > > > merge(String,Map,Map)
>> > > > > is
>> > > > > > > > that
>> > > > > > > > > it
>> > > > > > > > > > > leaves open no opportunity to stream a List of Maps
>> to a
>> > > > > merge()
>> > > > > > > for
>> > > > > > > > a
>> > > > > > > > > > form
>> > > > > > > > > > > of batch loading which is mighty common and one of the
>> > > > > variations
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > coalesce() pattern that I alluded to at the start of
>> all
>> > > > this.
>> > > > > I
>> > > > > > > > think
>> > > > > > > > > > that
>> > > > > > > > > > > we would want to be sure that we left open the option
>> to
>> > do
>> > > > > that
>> > > > > > > > > somehow.
>> > > > > > > > > > > 8. If we had a general purpose merge() step I wonder
>> if
>> > it
>> > > > > makes
>> > > > > > > > > > developing
>> > > > > > > > > > > the API as you suggested easier to do?
>> > > > > > > > > > >
>> > > > > > > > > > > I think I'd like to solve the problems you describe in
>> > your
>> > > > > post
>> > > > > > as
>> > > > > > > > > well
>> > > > > > > > > > as
>> > > > > > > > > > > the ones in mine. There is some relation there, but
>> gaps
>> > as
>> > > > > well.
>> > > > > > > > With
>> > > > > > > > > > more
>> > > > > > > > > > > discussion here we can figure something out.
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for starting this talk - good one!
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <
>> > > > > > > > [email protected]>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > I've had a few on and off discussions with a few
>> people
>> > > > here,
>> > > > > > so
>> > > > > > > I
>> > > > > > > > > > wanted
>> > > > > > > > > > > > to send this out to everyone for feedback.
>> > > > > > > > > > > >
>> > > > > > > > > > > > What are people's thoughts on creating a new set of
>> > steps
>> > > > > that
>> > > > > > > > codify
>> > > > > > > > > > > > common Gremlin best practices?
>> > > > > > > > > > > >
>> > > > > > > > > > > > I think there are several common Gremlin patterns
>> where
>> > > > users
>> > > > > > > would
>> > > > > > > > > > > benefit
>> > > > > > > > > > > > from the additional guidance that these codified
>> steps
>> > > > > > represent.
>> > > > > > > > > The
>> > > > > > > > > > > > first one I would recommend though is codifying the
>> > > element
>> > > > > > > > existence
>> > > > > > > > > > > > pattern into a single Gremlin step, something like:
>> > > > > > > > > > > >
>> > > > > > > > > > > > mergeV(String, Map, Map)
>> > > > > > > > > > > >      String - The vertex label
>> > > > > > > > > > > >      Map (first) - The properties to match existing
>> > > > vertices
>> > > > > on
>> > > > > > > > > > > >      Map (second) - Any additional properties to set
>> > if a
>> > > > new
>> > > > > > > > vertex
>> > > > > > > > > is
>> > > > > > > > > > > > created (optional)
>> > > > > > > > > > > > mergeE(String, Map, Map)
>> > > > > > > > > > > >      String - The edge label
>> > > > > > > > > > > >      Map (first) - The properties to match existing
>> > edge
>> > > on
>> > > > > > > > > > > >      Map (second) - Any additional properties to set
>> > if a
>> > > > new
>> > > > > > > edge
>> > > > > > > > is
>> > > > > > > > > > > > created (optional)
>> > > > > > > > > > > >
>> > > > > > > > > > > > In each of these cases these steps would perform the
>> > same
>> > > > > > upsert
>> > > > > > > > > > > > functionality as the element existence pattern.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Example:
>> > > > > > > > > > > >
>> > > > > > > > > > > > g.V().has('person','name','stephen').
>> > > > > > > > > > > >            fold().
>> > > > > > > > > > > >            coalesce(unfold(),
>> > > > > > > > > > > >                    addV('person').
>> > > > > > > > > > > >                      property('name','stephen').
>> > > > > > > > > > > >                      property('age',34))
>> > > > > > > > > > > >
>> > > > > > > > > > > > would become:
>> > > > > > > > > > > >
>> > > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age': 34})
>> > > > > > > > > > > >
>> > > > > > > > > > > > I think that this change would be a good addition to
>> > the
>> > > > > > language
>> > > > > > > > for
>> > > > > > > > > > > > several reasons:
>> > > > > > > > > > > >
>> > > > > > > > > > > > * This codifies the best practice for a specific
>> > > > > action/recipe,
>> > > > > > > > which
>> > > > > > > > > > > > reduces the chance that someone uses the pattern
>> > > > incorrectly
>> > > > > > > > > > > > * Most complex Gremlin traversals are verbose.
>> > Reducing
>> > > > the
>> > > > > > > amount
>> > > > > > > > > of
>> > > > > > > > > > > code
>> > > > > > > > > > > > that needs to be written and maintained allows for a
>> > > better
>> > > > > > > > developer
>> > > > > > > > > > > > experience.
>> > > > > > > > > > > > * It will lower the bar of entry for a developer by
>> > > making
>> > > > > > these
>> > > > > > > > > > actions
>> > > > > > > > > > > > more discoverable.  The more we can help bring these
>> > > > patterns
>> > > > > > to
>> > > > > > > > the
>> > > > > > > > > > > > forefront of the language via these pattern/meta
>> steps
>> > > the
>> > > > > more
>> > > > > > > we
>> > > > > > > > > > guide
>> > > > > > > > > > > > users towards writing better Gremlin faster
>> > > > > > > > > > > > * This allows DB vendors to optimize for this
>> pattern
>> > > > > > > > > > > >
>> > > > > > > > > > > > I know that this would likely be the first step in
>> > > Gremlin
>> > > > > that
>> > > > > > > > > > codifies
>> > > > > > > > > > > a
>> > > > > > > > > > > > pattern, so I'd like to get others' thoughts on
>> this?
>> > > > > > > > > > > >
>> > > > > > > > > > > > Dave
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
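
For readers skimming the thread, the get-or-create semantics proposed for
mergeV(String, Map, Map), together with the "created or found" indicator
raised in item 6, can be sketched outside Gremlin. This is only an
illustration over an in-memory list of dicts: merge_v, its parameter names
and the (vertex, created) return shape are assumptions made for this
sketch, not a TinkerPop API and not a settled design from the thread.

```python
from typing import Any, Dict, List, Optional, Tuple

Vertex = Dict[str, Any]


def merge_v(vertices: List[Vertex], label: str, match: Dict[str, Any],
            on_create: Optional[Dict[str, Any]] = None) -> Tuple[Vertex, bool]:
    """Hypothetical get-or-create; returns (vertex, created)."""
    for v in vertices:
        # Equality-only matching on the label plus every key in `match`.
        if v.get("label") == label and all(
                k in v and v[k] == val for k, val in match.items()):
            return v, False
    # No match: create with the match properties plus the create-only ones.
    new_vertex = {"label": label, **match, **(on_create or {})}
    vertices.append(new_vertex)
    return new_vertex, True


graph: List[Vertex] = []
# First call creates the vertex; the second finds it and leaves it untouched.
first, created_first = merge_v(graph, "person", {"name": "stephen"}, {"age": 34})
second, created_second = merge_v(graph, "person", {"name": "stephen"}, {"age": 99})
```

Note that the second map is applied only on creation, which mirrors the
"additional properties to set if a new vertex is created" wording in the
original proposal; updating an existing element would need a separate
decision.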
