I agree with Josh's description of what the upsertV() functionality is intended to be.
I also fully support the simplification provided by the by() modulation that Stephen suggested, removing the second map. I think that provides a much cleaner and easier to comprehend syntax. With that agreement, I think this does beg the question of if this should be a new step (upsertV()) or just an additional signature on the addV() step? I Stephen alluded to this on the dev list and after thinking about this a bit I think that I am favor of not adding a step and just adding a new signature to the existing step (if possible). Thoughts? Dave On Wed, Dec 9, 2020 at 4:33 AM Stephen Mallette <[email protected]> wrote: > Josh, thanks for your thoughts - some responses inline: > > On Tue, Dec 8, 2020 at 10:16 PM Josh Perryman <[email protected]> > wrote: > > > I'll offer some thoughts. I'm seeing upsertV() as an idempotent > getOrCreate > > call which always returns a vertex with the label/property values > specified > > within the step. It's sort of a declarative pattern: "return this vertex > to > > me, find it if you can, create it if you must." > > > > I like this description - I've added it to the gist, though it's a bit at > odds with Dave's previous post, so we'll consider it a temporary addition > until he responds. > > > > On that account, I do like the simplification in 1. Repetition shouldn't > be > > necessary. In an ideal world, the engine should know the primary > > identifiers (name or id) and find/create the vertex based on them. Any > > other included values will be "trued up" as well. But this may be a > bridge > > too far for TinkerPop since knowing identifiers may require a specified > > schema. I'd prefer to omit the third input, but it might be necessary to > > keep it so that the second input can be for the matching use case. > > > > In my most recent post on gremlin-users I think I came up with a nice way > to get rid of the second Map. One Map that forms the full list of > properties for upserting is easier than partitioning two Maps that > essentially merge together. I imagine it's unlikely that application code > will have that separation naturally so users will have the added step of > trying to separate their data into searchable vs "just data". Getting us to > one Map argument will simplify APIs for us and reduce complexity to users. > Here is what I'd proposed for those not following over there: > > // match on name and age (or perhaps whatever the underlying graph system > thinks is best?) > g.upsertV('person', [name:'marko',age:29]) > > // match on name only > g.upsertV('person', [name:'marko',age:29]).by('name') > > // explicitly match on name and age > g.upsertV('person', [name:'marko',age:29]). > by('name').by('age') > > // match on id only > g.upsertV('person', [(T.id): 100, name:'marko',age:29]).by(T.id) > > // match on whatever the by(Traversal) predicate defines > g.upsertV('person', [name:'marko',age:29]). > by(has('name', 'marko')) > > // match on id, then update age > g.upsertV('person', [(T.id): 100, name:'marko']).by(T.id). > property('age',29) > > With this model, we get one Map argument that represents the complete > property set to be added/updated to the graph and the user can hint on what > key they wish to match on using by() where that sort of step modulation > should be a well understood and familiar concept in Gremlin at this point. > > So that means I think 2 should always match or update the additional > > values. Again, we're specifying the expected result and letting the > engine > > figure out best how to return that results and appropriately maintain > > state. > > > > I again like this description, but we'll see what Dave's thoughts are since > he's a bit behind on the threads at this point I think. > > > > I'm also presuming that anything not included as inputs to the upsertV() > > step are then to be handled by following steps. I'm hoping that is a > > sufficient approach for addressing the multi/meta property use cases > > brought up in 3. > > > > yeah................it needs more thought. I spent more time thinking on > this issue yesterday than I have for all the previous posts combined and I > think it yielded something good in that revised syntax. It's going to take > more of that kind of elbow grease to dig into these lesser use cases to > make sure we aren't coding ourselves into corners. > > > > I do like the idea of using modulators (with(), by()) for more > > sophisticated usage and advanced use cases. Also, the streaming examples > > are quite elegant allowing for a helpful separation of data and logic. > > > > cool - hope you like the revised syntax I posted then. :) > > > > That's my humble take. This is a very welcome addition to the language > and > > I appreciate the thoughtful & collaborative approach to the design > > considerations. > > > > Thanks again and please keep the thoughts coming. Lots of other interesting > design discussions seem to be brewing. > > > > > > Josh > > > > On Tue, Dec 8, 2020 at 8:57 AM Stephen Mallette <[email protected]> > > wrote: > > > > > I started a expanded this discussion to gremlin-users for a wider > > audience > > > and the thread is starting to grow: > > > > > > https://groups.google.com/g/gremlin-users/c/QBmiOUkA0iI/m/pj5Ukiq6AAAJ > > > > > > I guess we'll need to summarize that discussion back here now.... > > > > > > I did have some more thoughts to hang out there and figured that I > > wouldn't > > > convolute the discussion on gremlin-users with it so I will continue > the > > > discussion here. > > > > > > 1, The very first couple of examples seem wrong (or at least not best > > > demonstrating the usage): > > > > > > g.upsertV('person', [name: 'marko'], > > > [name: 'marko', age: 29]) > > > g.upsertV('person', [(T.id): 1], > > > [(T.id): 1, name: 'Marko']) > > > > > > should instead be: > > > > > > g.upsertV('person', [name: 'marko'], > > > [age: 29]) > > > g.upsertV('person', [(T.id): 1], > > > [name: 'Marko']) > > > > > > 2. I can't recall if we made this clear anywhere but in situations > where > > we > > > "get" rather than "create" do the additional properties act in an > update > > > fashion to the element that was found? I think I've been working on the > > > assumption that it would, though perhaps that is not always desirable? > > > > > > 3. We really never settled up how to deal with multi/meta-properties. > > That > > > story should be clear so that when we document upsert() we include the > > > approaches for the fallback patterns that don't meet the 90% of use > cases > > > we are targeting and I sense that we're saying that > meta/multi-properties > > > don't fit in that bucket. So with that in mind, I don't think that the > > > following works for metaproperties: > > > > > > g.upsertV('person', [(T.id): 1], > > > [name:[acl: 'public']) > > > > > > as it doesn't let us set the value for the "name" just the pairs for > the > > > meta-properties. I guess a user would have to fall back to: > > > > > > g.upsertV('person', [(T.id): 1]). > > > property('name','marko','acl','public') > > > > > > // or use the additional properties syntax > > > g.upsertV('person', [(T.id): 1],[name:'marko']). > > > properties('name').property('acl','public') > > > > > > // or if there were multi-properties then maybe... > > > g.upsertV('person', [(T.id): 1],[name:'marko']). > > > properties('name').hasValue('marko').property('acl','public') > > > > > > As for multi-properties, I dont think we should assume that a List > object > > > should be interpreted by Gremlin as a multi-property. Perhaps we just > > rely > > > on the underlying graph to properly deal with that given a schema or > the > > > user falls back to: > > > > > > // this ends up as however the graph deals with a List object > > > g.upsertV('person', [(T.id): 1], [lang: ['java', 'scala', 'java']) > > > > > > // this is explicit > > > g.upsertV('person', [(T.id): 1]). > > > property(list, 'lang', 'java'). > > > property(list, 'lang, 'scala'). > > > property(list, 'lang', 'java') > > > > > > If that makes sense to everyone, I will update the gist. > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Oct 27, 2020 at 11:51 PM David Bechberger <[email protected] > > > > > wrote: > > > > > > > Hello Stephen, > > > > > > > > Thanks for making that gist, its much easier to follow along with the > > > > proposed syntax there. To your specific comments: > > > > > > > > #1 - My only worry with the term upsert is that it is not as widely a > > > used > > > > term for this sort of pattern as "Merge" (e.g. SQL, Cypher). > However I > > > > don't have a strong opinion on this, so I am fine with either. > > > > #2 - My only real objective here is to make sure that we make the > > 80-90% > > > > case easy and straightforward. I think that having the fallback > option > > > of > > > > using the current syntax for any complicated edge cases should be > > > > considered here as well. I'd appreciate your thoughts here as these > are > > > > good points you bring up that definitely fall into the 80-90% use > case. > > > > #3 - Those points make sense to me, not sure I have anything further > to > > > add > > > > #4 - I don't think I would expect to have the values extracted from > the > > > > traversal filters but I'd be interested in other opinions on that. > > > > > > > > Thanks, > > > > Dave > > > > > > > > On Thu, Oct 15, 2020 at 5:30 AM Stephen Mallette < > [email protected] > > > > > > > wrote: > > > > > > > > > It's been a couple weeks since we moved this thread so I went back > > > > through > > > > > it and refreshed my mind with what's there. Dave, nice job putting > > all > > > > > those examples together. I've recollected them all together and > have > > > just > > > > > been reviewing them in a more formatted style with proper Groovy > > syntax > > > > > than what can be accomplished in this mailing list. Anyone who > cares > > to > > > > > review in that form can do so here (i've pasted its markdown > contents > > > at > > > > > the bottom of this post as well): > > > > > > > > > > > https://gist.github.com/spmallette/5cd448f38d5dae832c67d890b576df31 > > > > > > > > > > In this light, I have the following comments and thoughts: > > > > > > > > > > 1. I feel more inclined toward saving "merge" for a more > generalized > > > step > > > > > of that name and have thus renamed the examples to use "upsert". > > > > > > > > > > 2. I still don't quite feel comfortable with meta/multi-properties > > > > > examples. Perhaps you could explain further as it's possible I'm > not > > > > > thinking things through properly. For meta-properties the example > is: > > > > > > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, name:[first: Marko', last: 'R']) > > > > > > > > > > So I see that "name" takes a Map here, but how does Gremlin know if > > > that > > > > > means "meta-property" or "Map value" and if the former, then how do > > you > > > > set > > > > > the value of "name"? My question is similar to the multi-property > > > > examples > > > > > - using this one: > > > > > > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1,lang: ['java', 'scala']) > > > > > > > > > > How does Gremlin know if the user means that "lang" should be a > > > > > "multi-property" with Cardinality.list (or Cardinality.set for that > > > > matter) > > > > > or a value of Cardinality.single with a List/Array value? I can > > perhaps > > > > > suggest some solutions here but wanted to make sure I wasn't just > > > > > misunderstanding something. > > > > > > > > > > 3. The "upsert with stream" example is: > > > > > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'], > > > > > [(id): 2, (label): 'person', name: 'josh']). > > > > > upsertV(values(label), valueMap(id), valueMap()) > > > > > > > > > > That would establish an API of upsertV(Traversal, Traversal, > > Traversal) > > > > > which would enable your final thought of your last post with: > > > > > > > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', > > > 'NY'), > > > > > [active: true]) > > > > > > > > > > The open-ended nature of that is neat but comes with some > complexity > > > with > > > > > steps that traditionally take an "open" Traversal (and this step > now > > > has > > > > up > > > > > to three of them) - perhaps, that's our own fault in some ways. For > > the > > > > > "stream" concept, values()/valueMap() won't immediately work that > way > > > as > > > > > they only apply to Element...given today's Gremlin semantics, I > think > > > > we'd > > > > > re-write that as: > > > > > > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'], > > > > > [(id): 2, (label): 'person', name: 'josh']). > > > > > upsertV(select(label), select(id), select(id,label,'name')) > > > > > > > > > > Though, even that doesn't quite work because you can't select(T) > > right > > > > now > > > > > which means you'd need to go with: > > > > > > > > > > g.inject([id: 1, label: 'person', name: 'marko'], > > > > > [id: 2, label: 'person', name: 'josh']). > > > > > upsertV(select('label'), select('id'), > select('id','label','name')) > > > > > > > > > > not terrible in a way and perhaps we could fix select(T) - can't > > > remember > > > > > why that isn't allowed except that select() operates over multiple > > > scopes > > > > > and perhaps trying T through those scopes causes trouble. > > > > > > > > > > 4. Continuing with the use of Traversal as arguments, let's go back > > to: > > > > > > > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', > > > 'NY'), > > > > > [active: true]) > > > > > > > > > > I originally had it in my mind to prefer this approach over a Map > for > > > the > > > > > search as it provides a great deal of flexibility and fits the > common > > > > > filtering patterns that users follow really well. I suppose the > > > question > > > > is > > > > > what to do in the situation of "create". Would we extract all the > > > strict > > > > > equality filter values from the initial has(), thus, in the > example, > > > > > "state" but not "age", and create a vertex that has "state=NY, > > > > > active=true"? That is possible with how things are designed in > > Gremlin > > > > > today I think. > > > > > > > > > > This syntax does create some conflict with the subject of 3 and > > streams > > > > > because the traverser in the streams case is the incoming Map but > for > > > > this > > > > > context it's meant as a filter of V(). Getting too whacky? > > > > > > > > > > 5. I like the idea of using a side-effect to capture whether the > > > element > > > > > was created or not. that makes sense. I think that can work. > > > > > > > > > > ============================== > > > > > = gist contents > > > > > ============================== > > > > > > > > > > # API > > > > > > > > > > ```java > > > > > upsertV(String label, Map matchOrCreateProperties) > > > > > upsertV(String label, Map matchOrCreateProperties, Map > > > > > additionalProperties) > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties) > > > > > upsertV(Traversal label, Traversal matchOrCreateProperties, > Traversal > > > > > additionalProperties) > > > > > upsertV(...). > > > > > with(WithOptions.sideEffectLabel, 'a') > > > > > > > > > > upsertE(String label, Map matchOrCreateProperties) > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties) > > > > > upsertE(String label, Map matchOrCreateProperties, Map > > > > > additionalProperties) > > > > > upsertE(Traversal label, Traversal matchOrCreateProperties, > Traversal > > > > > additionalProperties) > > > > > upsertE(...). > > > > > from(Vertex incidentOut).to(Vertex incidentIn) > > > > > with(WithOptions.sideEffectLabel, 'a') > > > > > ``` > > > > > > > > > > # Examples > > > > > > > > > > ## upsert > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [name: 'marko'], > > > > > [name: 'marko', age: 29]) > > > > > ``` > > > > > > > > > > ## upsert with id > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, name: 'Marko']) > > > > > ``` > > > > > > > > > > ## upsert with Meta Properties > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, name:[first: Marko', last: 'R']) > > > > > ``` > > > > > > > > > > ## upsert with Multi Properties > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1,lang: ['java', 'scala']) > > > > > ``` > > > > > > > > > > ## upsert with Single Cardinality > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, lang: 'java') > > > > > ``` > > > > > > > > > > ## upsert with List Cardinality > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, lang: ['java', 'scala', 'java']) > > > > > ``` > > > > > > > > > > ## upsert with Set Cardinality > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, lang: ['java','scala']) > > > > > ``` > > > > > > > > > > ## upsert with List Cardinality - and add value to list > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [(T.id): 1], > > > > > [(T.id): 1, lang: ['java','scala','java']). > > > > > property(Cardinality.list, 'lang', 'java') > > > > > ``` > > > > > > > > > > ## upsert with stream > > > > > > > > > > ```groovy > > > > > // doesn't work with today's Gremlin semantics > > > > > g.inject([(id): 1, (label): 'person', name: 'marko'], > > > > > [(id): 2, (label): 'person', name: 'josh']). > > > > > upsertV(values(label), valueMap(id), valueMap()) > > > > > > > > > > // the following would be more in line with what's possible with > > > existing > > > > > Gremlin semantics: > > > > > g.inject([id: 1, label: 'person', name: 'marko'], > > > > > [id: 2, label: 'person', name: 'josh']). > > > > > upsertV(select('label'), select('id'), > select('id','label','name')) > > > > > ``` > > > > > > > > > > ## upsert with reporting added or updated side effect > > > > > > > > > > ```groovy > > > > > g.upsertV('person', [name: 'marko'], > > > > > [name: 'marko', age: 29]). > > > > > with(WithOptions.sideEffectLabel, 'a'). > > > > > project('vertex', 'added'). > > > > > by(). > > > > > by(cap('a')) > > > > > ``` > > > > > > > > > > ## upsert Edge assuming a self-relation of "marko" > > > > > > > > > > ```groovy > > > > > g.V().has('person','name','marko'). > > > > > upsertE( 'self', [weight:0.5]) > > > > > ``` > > > > > > > > > > ## upsert Edge with incident vertex checks > > > > > > > > > > ```groovy > > > > > g.upsertE('knows', [weight:0.5]). > > > > > from(V(1)).to(V(2)) > > > > > g.V().has('person','name','marko'). > > > > > upsertE( 'knows', [weight:0.5]). > > > > > to(V(2)) > > > > > ``` > > > > > > > > > > ## upsert using a Traversal for the match > > > > > > > > > > ```groovy > > > > > g.V().upsertV('person', __.has('person','age',gt(29)).has('state', > > > 'NY'), > > > > > [active: true]) > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 28, 2020 at 6:49 PM David Bechberger < > > [email protected]> > > > > > wrote: > > > > > > > > > > > So, I've been doing some additional thinking about the ways that > > this > > > > > could > > > > > > work based on the comments below and have put my comments inline. > > > > > > > > > > > > Dave > > > > > > > > > > > > On Tue, Sep 22, 2020 at 6:05 AM Stephen Mallette < > > > [email protected] > > > > > > > > > > > wrote: > > > > > > > > > > > > > I added some thoughts inline below: > > > > > > > > > > > > > > On Fri, Sep 18, 2020 at 3:51 PM David Bechberger < > > > > [email protected]> > > > > > > > wrote: > > > > > > > > > > > > > > > Thanks for the detailed comments Stephen. I have addressed > > them > > > > > inline > > > > > > > > below. > > > > > > > > > > > > > > > > I did read the proposal from earlier and I think that we are > in > > > > close > > > > > > > > agreement with what we are trying to accomplish. I also > fully > > > > > support > > > > > > > > Josh's comment on providing a mechanism for submitting a map > of > > > > > > > properties > > > > > > > > as manually unrolling this all right now leads to a lot of > > > > potential > > > > > > for > > > > > > > > error and a long messy traversal. > > > > > > > > > > > > > > > > I'm looking forward to this discussion on how to merge these > > two > > > > > > > proposals. > > > > > > > > > > > > > > > > > > > > > > > > 1. How would multi/meta-properties fit into the API you've > > > > proposed? > > > > > > > > > > > > > > > > My first thought here is that multi-properties would be > > > represented > > > > > as > > > > > > > > lists in the map, e.g. > > > > > > > > > > > > > > > > {names: ['Dave', 'David']} > > > > > > > > > > > > > > > > and meta-properties would be represented as maps in the maps, > > > e.g. > > > > > > > > > > > > > > > > {name: {first: 'Dave', last: 'Bechberger'}} > > > > > > > > > > > > > > > > I can't say I've thought through all the implications of this > > > > though > > > > > so > > > > > > > it > > > > > > > > is an area we would need to explore. > > > > > > > > > > > > > > > > > > > > > > The first implication that comes to mind is that it makes the > > > > > assumption > > > > > > > the user wants multi/meta-properties as opposed to a single > > > > cardinality > > > > > > of > > > > > > > List in the first case and a Map as a property value in the > > second > > > > > case. > > > > > > I > > > > > > > suppose that graphs with a schema could resolve those > assumptions > > > but > > > > > > > graphs that are schemaless would have a problem. The issue > could > > be > > > > > > > resolved by specialized configuration of "g" or per merge() > step > > > > using > > > > > a > > > > > > > with() modulator I suppose but that goes into a yet another > level > > > of > > > > > > > implications to consider. I've often wondered if the start > point > > > for > > > > > > > getting types/schema into TP3 without a full rewrite would be > in > > > this > > > > > > form > > > > > > > where Gremlin would be given hints as to what to expect as to > the > > > > types > > > > > > of > > > > > > > data it might encounter while traversing. Anyway, I'd be > hesitant > > > to > > > > go > > > > > > > down paths that don't account for multi/metaproperties well. > > They > > > > are > > > > > > > first class citizens in TP3 (with those hoping for extension of > > at > > > > > least > > > > > > > multiproperties to edges) and while I find them a constant > > > annoyance > > > > > for > > > > > > so > > > > > > > many reasons, we're kinda stuck with them. > > > > > > > > > > > > > > > > > > > I agree we need to account for multi/meta properties as 1st class > > > > > > citizens. This is my current thinking on the syntax for the > > > situations > > > > > we > > > > > > have laid out so far: > > > > > > > > > > > > *Merge* > > > > > > g.mergeV('name', {'name': 'marko'}, {'name': 'marko', 'age': 29}) > > > > > > > > > > > > > > > > > > *Merge with id* g.mergeV('name', {T.id: 1}, {T.id: 1, 'name': > > > 'Marko'}) > > > > > > > > > > > > > > > > > > *Merge with Meta Properties* g.mergeV('name', {T.id: 1}, {T.id: > 1, > > > > > 'name': > > > > > > {'first': Marko', 'last': 'R'}) > > > > > > > > > > > > > > > > > > *Merge with Multi Properties * g.mergeV('name', {T.id: 1}, {T.id: > > 1, > > > > > > 'lang': ['java', 'scala']) > > > > > > > > > > > > > > > > > > *Merge with Single Cardinality* g.mergeV('name', {T.id: 1}, > {T.id: > > 1, > > > > > > 'lang': 'java') > > > > > > > > > > > > > > > > > > *Merge with List Cardinality* g.mergeV('name', {T.id: 1}, {T.id: > 1, > > > > > 'lang': > > > > > > ['java', 'scala', 'java']) > > > > > > > > > > > > > > > > > > *Merge with Set Cardinality * g.mergeV('name', {T.id: 1}, {T.id: > 1, > > > > > 'lang': > > > > > > ['java', 'scala']) > > > > > > > > > > > > Since in a mergeV() scenario we are only ever adding whatever > > values > > > > are > > > > > > passed in there would be no need to specify the cardinality of > the > > > > > property > > > > > > being added. If they wanted to add a value to an existing > property > > > > then > > > > > > the current property() method would still be available on the > > output > > > > > > traversal. e.g. > > > > > > g.mergeV('name', {T.id: 1}, {T.id: 1, 'lang': ['java', 'scala', > > > > > > 'java']).property(Cardinality.list, 'lang, 'java') > > > > > > > > > > > > > > > > > > *Merge with stream * g.inject([{id: 1, label: 'person', 'name': > > > > 'marko'}, > > > > > > {id: 2, label: 'person', 'name': 'josh'}]). > > > > > > mergeV(values('label'), valueMap('id'), valueMap()) > > > > > > > > > > > > > > > > > > *Merge with reporting added or updated side effect* > > g.mergeV('name', > > > > > > {'name': 'marko'}, {'name': 'marko', 'age': 29}). > > > > > > with(WithOptions.sideEffectLabel, 'a'). > > > > > > project('vertex', 'added'). > > > > > > by(). > > > > > > by(cap('a')) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. How would users set the T.id on creation? would that T.id > > just > > > > be > > > > > a > > > > > > > key > > > > > > > > in the first Map argument? > > > > > > > > > > > > > > > > Yes, I was thinking that T.id would be the key name, e.g.: > > > > > > > > > > > > > > > > g.mergeV('name', {T.id: 1}, {'name': 'Marko'}) > > > > > > > > > > > > > > > > > > > > > > ok - I can't say I see a problem with that atm. > > > > > > > > > > > > > > > > > > > > > > 3. I do like the general idea of a match on multiple > properties > > > for > > > > > the > > > > > > > > first argument as a convenience but wonder about the > > specificity > > > of > > > > > > this > > > > > > > > API a bit as it focuses heavily on equality - I suppose > that's > > > most > > > > > > cases > > > > > > > > for get-or-create, so perhaps that's ok. > > > > > > > > > > > > > > > > In most cases I've seen use exact matches on the vertex or > > > edge. I > > > > > > think > > > > > > > > it might be best to keep this straightforward as any complex > > edge > > > > > cases > > > > > > > > still can perform the same functionality using the coalesce() > > > > > pattern. > > > > > > > > > > > > > > > > > > > > > > I played around with the idea to use modulators to apply > > additional > > > > > > > constraints: > > > > > > > > > > > > > > g.mergeV('person', {'state':'NY'}, {'active': true}). > > > > > > > by(has('age', gt(29)) > > > > > > > > > > > > > > I suppose that's neat but maybe not...we could easily get the > > same > > > > > thing > > > > > > > from: > > > > > > > > > > > > > > g.V().has('person','age',gt(29)). > > > > > > > mergeV('person', {'state':'NY'}, {'active': true}). > > > > > > > > > > > > > > which is more readable though it does make it harder for > > providers > > > to > > > > > > > optimize upsert operations if they could make use of the has() > as > > > > part > > > > > of > > > > > > > that. I think I like has() as part of the main query rather > than > > > in a > > > > > > by() > > > > > > > - just thinking out loud. > > > > > > > > > > > > > > > > > > > > Another potential option here would be to allow the second > > parameter > > > of > > > > > > mergeV() accept a traversal. Not sure how much complexity that > > would > > > > add > > > > > > though > > > > > > > > > > > > g.V().mergeV('person', __.has('person','age',gt(29)).has('state', > > > > 'NY'}., > > > > > > {'active': true}). > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. I think your suggestion points to one of the troubles > > Gremlin > > > > has > > > > > > > which > > > > > > > > we see with "algorithms" - extending the language with new > > steps > > > > that > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up with > > > > > > > > shortestPath() step) pollutes the core language a bit, hence > my > > > > > > > > generalization of "merging" in my link above which fits into > > the > > > > core > > > > > > > > Gremlin language style. There is a bigger picture where we > are > > > > > missing > > > > > > > > something in Gremlin that lets us extend the language in ways > > > that > > > > > let > > > > > > us > > > > > > > > easily introduce new steps that aren't for general purpose. > > This > > > > > issue > > > > > > is > > > > > > > > discussed in terms of "algorithms" here: > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I > > think > > > > we > > > > > > > could > > > > > > > > see how there might be some "mutation" extension steps that > > would > > > > > cover > > > > > > > > your suggested API, plus batch operations, etc. We need a way > > to > > > > add > > > > > > > > "sugar" without it interfering with the consistency of the > > core. > > > > > > > Obviously > > > > > > > > this is a bigger issue but perhaps important to solve to > > > implement > > > > > > steps > > > > > > > in > > > > > > > > the fashion you describe. > > > > > > > > > > > > > > > > I agree that the "algorithm" steps do seem a bit odd in the > > core > > > > > > language > > > > > > > > but I think the need is there. I'd be interested in > furthering > > > > this > > > > > > > > discussion but I think these "pattern" steps may or may not > be > > > the > > > > > same > > > > > > > as > > > > > > > > the algorithm steps. I'll have to think on that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to > > specify > > > > what > > > > > > > > element type the first Map argument should be applied to? > what > > > > about > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ? > That's > > > > > tricky > > > > > > > but > > > > > > > > I don't think we should miss that. Perhaps merge() could be a > > > > > "complex > > > > > > > > modulator"?? that's a new concept of course, but you would do > > > > > > > g.V().merge() > > > > > > > > and the label and first Map would fold to VertexStartStep > (i.e. > > > > V()) > > > > > > for > > > > > > > > the lookup and then a MergeStep would follow - thus a > "complex > > > > > > modulator" > > > > > > > > as it does more than just change the behavior of the previous > > > step > > > > - > > > > > it > > > > > > > > also adds its own. I suppose it could also add has() steps > > > followed > > > > > by > > > > > > > the > > > > > > > > MergeStep and then the has() operations would fold in > normally > > as > > > > > they > > > > > > do > > > > > > > > today. In this way, we can simplify to just one single > > > > > > > > merge(String,Map,Map). ?? > > > > > > > > > > > > > > > > I agree that we should also think about how to include > > properties > > > > in > > > > > > this > > > > > > > > merge construct. The reason I was thinking about mergeV() > and > > > > > mergeE() > > > > > > > is > > > > > > > > that it follows the same pattern as the already well > understood > > > > > > > > addV()/addE() steps. I am a bit wary of trying to generalize > > > this > > > > > down > > > > > > > to > > > > > > > > a single merge() step as these sorts of complex overloads > make > > it > > > > > hard > > > > > > to > > > > > > > > figure out which Gremlin step you should use for a particular > > > > pattern > > > > > > > (e.g. > > > > > > > > upsert vertex or upsert edge). One thing that I think > > customers > > > > find > > > > > > > > useful about the addV/addE step is that they are very > > > discoverable, > > > > > the > > > > > > > > name tells me what functionality to expect from that step. > > > > > > > > > > > > > > > > > > > > > > I'm starting to agree with the idea that we have to do > something > > > like > > > > > > > mergeV()/E() as it wouldn't quite work as a start step > otherwise. > > > We > > > > > > > couldn't do: > > > > > > > > > > > > > > g.merge() > > > > > > > > > > > > > > as we wouldnt know what sort of Element it applied to. If so, I > > > > wonder > > > > > if > > > > > > > it would be better to preserve "merge" for my more general > > use-case > > > > and > > > > > > > prefer upsertV()/E()? > > > > > > > > > > > > > > Also, perhaps mergeVP() doesn't need to exist as perhaps it is > an > > > > > > uncommon > > > > > > > case (compared to V/E()) and we don't really have addVP() as an > > > > > analogous > > > > > > > step. Perhaps existing coalesce() patterns and/or my more > general > > > > > purpose > > > > > > > merge() step would satisfy those situations?? > > > > > > > > > > > > > > > > > > > I think that a mergeVP() type step may require a different > pattern > > > > since > > > > > it > > > > > > would really need to accept a map of values otherwise it would > > > > > essentially > > > > > > perform the same function as property() > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 6. One thing neither my approach nor yours seems to do is > tell > > > the > > > > > user > > > > > > > if > > > > > > > > they created something or updated something - that's another > > > thing > > > > > I've > > > > > > > > seen users want to have in get-or-create. Here again we go > > deeper > > > > > into > > > > > > a > > > > > > > > less general step specification as alluded to in 4, but a > > merge() > > > > > step > > > > > > as > > > > > > > > proposed in 5, might return [Element,boolean] so as to > provide > > an > > > > > > > indicator > > > > > > > > of creation? > > > > > > > > > > > > > > > > Hmm, I'll have to think about that. How would returning > > multiple > > > > > > values > > > > > > > > work if we want to chain these together. e.g. Add a vertex > > > between > > > > > two > > > > > > > > edges but make sure the vertices exist? > > > > > > > > > > > > > > > > > > > > > > > > > > > Added a thought on how to accomplish this above that I'd like to > > get > > > > > > thoughts on. > > > > > > > > > > > > > > > > > > > > > > > > > > > > 7. You were just introducing your ideas here, so perhaps you > > > > haven't > > > > > > > gotten > > > > > > > > this far yet, but a shortcoming to doing > merge(String,Map,Map) > > is > > > > > that > > > > > > it > > > > > > > > leaves open no opportunity to stream a List of Maps to a > > merge() > > > > for > > > > > a > > > > > > > form > > > > > > > > of batch loading which is mighty common and one of the > > variations > > > > of > > > > > > the > > > > > > > > coalesce() pattern that I alluded to at the start of all > this. > > I > > > > > think > > > > > > > that > > > > > > > > we would want to be sure that we left open the option to do > > that > > > > > > somehow. > > > > > > > > 8. If we had a general purpose merge() step I wonder if it > > makes > > > > > > > developing > > > > > > > > the API as you suggested easier to do? > > > > > > > > > > > > > > > > Hmm, let me think about that one. > > > > > > > > > > > > > > > > > > > > > > > I will add an item 8 to think about which I didn't mention > > before: > > > > > > > > > > > > > > 8. The signature you suggested for mergeE() is: > > > > > > > > > > > > > > mergeE(String, Map, Map) > > > > > > > String - The edge label > > > > > > > Map (first) - The properties to match existing edge on > > > > > > > Map (second) - Any additional properties to set if a new > > edge > > > is > > > > > > > created (optional) > > > > > > > > > > > > > > but does that exactly answer the need? Typically the idea is to > > > > detect > > > > > an > > > > > > > edge between two vertices not globally in the graph by way of > > some > > > > > > > properties. This signature doesn't seem to allow for that as it > > > > doesn't > > > > > > > allow specification of the vertices to test against. Perhaps > the > > > > answer > > > > > > is > > > > > > > to use from() and to() modulators? > > > > > > > > > > > > > > g.mergeE('knows', {'weight':0.5}). > > > > > > > from(V(1)).to(V(2)) > > > > > > > g.V().has('person','name','marko'). > > > > > > > mergeE( 'knows', {'weight':0.5}). > > > > > > > to(V(2)) > > > > > > > g.V().has('person','name','marko'). > > > > > > > mergeE( 'self', {'weight':0.5}) > > > > > > > > > > > > > > That seems to work? > > > > > > > > > > > > > > > > > > > I think that is an elegant solution to that problem. I like it > and > > > it > > > > > > keeps in line with the way that addE() works. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 17, 2020 at 4:31 AM Stephen Mallette < > > > > > [email protected] > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > I like our coalesce() pattern but it is verbose and over > time > > > it > > > > > has > > > > > > > gone > > > > > > > > > from a simple pattern to one with numerous variations for > all > > > > > manner > > > > > > of > > > > > > > > > different sorts of merge-like operations. As such, I do > think > > > we > > > > > > should > > > > > > > > > introduce something to cover this pattern. > > > > > > > > > > > > > > > > > > I like that you used the word "merge" in your description > of > > > this > > > > > as > > > > > > it > > > > > > > > is > > > > > > > > > the word I've liked using. You might want to give a look at > > my > > > > > > proposed > > > > > > > > > merge() step from earlier in the year: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E > > > > > > > > > > > > > > > > > > I'm just going to dump thoughts as they come regarding what > > you > > > > > > wrote: > > > > > > > > > > > > > > > > > > 1. How would multi/meta-properties fit into the API you've > > > > > proposed? > > > > > > > > > 2. How would users set the T.id on creation? would that > T.id > > > just > > > > > be > > > > > > a > > > > > > > > key > > > > > > > > > in the first Map argument? > > > > > > > > > 3. I do like the general idea of a match on multiple > > properties > > > > for > > > > > > the > > > > > > > > > first argument as a convenience but wonder about the > > > specificity > > > > of > > > > > > > this > > > > > > > > > API a bit as it focuses heavily on equality - I suppose > > that's > > > > most > > > > > > > cases > > > > > > > > > for get-or-create, so perhaps that's ok. > > > > > > > > > 4. I think your suggestion points to one of the troubles > > > Gremlin > > > > > has > > > > > > > > which > > > > > > > > > we see with "algorithms" - extending the language with new > > > steps > > > > > that > > > > > > > > > provides a form of "sugar" (e.g. in algorithms we end up > with > > > > > > > > > shortestPath() step) pollutes the core language a bit, > hence > > my > > > > > > > > > generalization of "merging" in my link above which fits > into > > > the > > > > > core > > > > > > > > > Gremlin language style. There is a bigger picture where we > > are > > > > > > missing > > > > > > > > > something in Gremlin that lets us extend the language in > ways > > > > that > > > > > > let > > > > > > > us > > > > > > > > > easily introduce new steps that aren't for general purpose. > > > This > > > > > > issue > > > > > > > is > > > > > > > > > discussed in terms of "algorithms" here: > > > > > > > > > https://issues.apache.org/jira/browse/TINKERPOP-1991 but I > > > think > > > > > we > > > > > > > > could > > > > > > > > > see how there might be some "mutation" extension steps that > > > would > > > > > > cover > > > > > > > > > your suggested API, plus batch operations, etc. We need a > way > > > to > > > > > add > > > > > > > > > "sugar" without it interfering with the consistency of the > > > core. > > > > > > > > Obviously > > > > > > > > > this is a bigger issue but perhaps important to solve to > > > > implement > > > > > > > steps > > > > > > > > in > > > > > > > > > the fashion you describe. > > > > > > > > > 5. I suppose that the reason for mergeE and mergeV is to > > > specify > > > > > what > > > > > > > > > element type the first Map argument should be applied to? > > what > > > > > about > > > > > > > > > mergeVP (i.e. vertex property as it too is an element) ? > > That's > > > > > > tricky > > > > > > > > but > > > > > > > > > I don't think we should miss that. Perhaps merge() could > be a > > > > > > "complex > > > > > > > > > modulator"?? that's a new concept of course, but you would > do > > > > > > > > g.V().merge() > > > > > > > > > and the label and first Map would fold to VertexStartStep > > (i.e. > > > > > V()) > > > > > > > for > > > > > > > > > the lookup and then a MergeStep would follow - thus a > > "complex > > > > > > > modulator" > > > > > > > > > as it does more than just change the behavior of the > previous > > > > step > > > > > - > > > > > > it > > > > > > > > > also adds its own. I suppose it could also add has() steps > > > > followed > > > > > > by > > > > > > > > the > > > > > > > > > MergeStep and then the has() operations would fold in > > normally > > > as > > > > > > they > > > > > > > do > > > > > > > > > today. In this way, we can simplify to just one single > > > > > > > > > merge(String,Map,Map). ?? > > > > > > > > > 6. One thing neither my approach nor yours seems to do is > > tell > > > > the > > > > > > user > > > > > > > > if > > > > > > > > > they created something or updated something - that's > another > > > > thing > > > > > > I've > > > > > > > > > seen users want to have in get-or-create. Here again we go > > > deeper > > > > > > into > > > > > > > a > > > > > > > > > less general step specification as alluded to in 4, but a > > > merge() > > > > > > step > > > > > > > as > > > > > > > > > proposed in 5, might return [Element,boolean] so as to > > provide > > > an > > > > > > > > indicator > > > > > > > > > of creation? > > > > > > > > > 7. You were just introducing your ideas here, so perhaps > you > > > > > haven't > > > > > > > > gotten > > > > > > > > > this far yet, but a shortcoming to doing > > merge(String,Map,Map) > > > is > > > > > > that > > > > > > > it > > > > > > > > > leaves open no opportunity to stream a List of Maps to a > > > merge() > > > > > for > > > > > > a > > > > > > > > form > > > > > > > > > of batch loading which is mighty common and one of the > > > variations > > > > > of > > > > > > > the > > > > > > > > > coalesce() pattern that I alluded to at the start of all > > this. > > > I > > > > > > think > > > > > > > > that > > > > > > > > > we would want to be sure that we left open the option to do > > > that > > > > > > > somehow. > > > > > > > > > 8. If we had a general purpose merge() step I wonder if it > > > makes > > > > > > > > developing > > > > > > > > > the API as you suggested easier to do? > > > > > > > > > > > > > > > > > > I think I'd like to solve the problems you describe in your > > > post > > > > as > > > > > > > well > > > > > > > > as > > > > > > > > > the ones in mine. There is some relation there, but gaps as > > > well. > > > > > > With > > > > > > > > more > > > > > > > > > discussion here we can figure something out. > > > > > > > > > > > > > > > > > > Thanks for starting this talk - good one! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 16, 2020 at 9:26 PM David Bechberger < > > > > > > [email protected]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > I've had a few on and off discussions with a few people > > here, > > > > so > > > > > I > > > > > > > > wanted > > > > > > > > > > to send this out to everyone for feedback. > > > > > > > > > > > > > > > > > > > > What are people's thoughts on creating a new set of steps > > > that > > > > > > codify > > > > > > > > > > common Gremlin best practices? > > > > > > > > > > > > > > > > > > > > I think there are several common Gremlin patterns where > > users > > > > > would > > > > > > > > > benefit > > > > > > > > > > from the additional guidance that these codified steps > > > > represent. > > > > > > > The > > > > > > > > > > first one I would recommend though is codifying the > element > > > > > > existence > > > > > > > > > > pattern into a single Gremlin step, something like: > > > > > > > > > > > > > > > > > > > > mergeV(String, Map, Map) > > > > > > > > > > String - The vertex label > > > > > > > > > > Map (first) - The properties to match existing > > vertices > > > on > > > > > > > > > > Map (second) - Any additional properties to set if a > > new > > > > > > vertex > > > > > > > is > > > > > > > > > > created (optional) > > > > > > > > > > mergeE(String, Map, Map) > > > > > > > > > > String - The edge label > > > > > > > > > > Map (first) - The properties to match existing edge > on > > > > > > > > > > Map (second) - Any additional properties to set if a > > new > > > > > edge > > > > > > is > > > > > > > > > > created (optional) > > > > > > > > > > > > > > > > > > > > In each of these cases these steps would perform the same > > > > upsert > > > > > > > > > > functionality as the element existence pattern. > > > > > > > > > > > > > > > > > > > > Example: > > > > > > > > > > > > > > > > > > > > g.V().has('person','name','stephen'). > > > > > > > > > > fold(). > > > > > > > > > > coalesce(unfold(), > > > > > > > > > > addV('person'). > > > > > > > > > > property('name','stephen'). > > > > > > > > > > property('age',34)) > > > > > > > > > > > > > > > > > > > > would become: > > > > > > > > > > > > > > > > > > > > g.mergeV('person', {'name': 'stephen'}, {'age', 34}) > > > > > > > > > > > > > > > > > > > > I think that this change would be a good addition to the > > > > language > > > > > > for > > > > > > > > > > several reasons: > > > > > > > > > > > > > > > > > > > > * This codifies the best practice for a specific > > > action/recipe, > > > > > > which > > > > > > > > > > reduces the chance that someone uses the pattern > > incorrectly > > > > > > > > > > * Most complex Gremlin traversals are verbose. Reducing > > the > > > > > amount > > > > > > > of > > > > > > > > > code > > > > > > > > > > that needs to be written and maintained allows for a > better > > > > > > developer > > > > > > > > > > experience. > > > > > > > > > > * It will lower the bar of entry for a developer by > making > > > > these > > > > > > > > actions > > > > > > > > > > more discoverable. The more we can help bring these > > patterns > > > > to > > > > > > the > > > > > > > > > > forefront of the language via these pattern/meta steps > the > > > more > > > > > we > > > > > > > > guide > > > > > > > > > > users towards writing better Gremlin faster > > > > > > > > > > * This allows DB vendors to optimize for this pattern > > > > > > > > > > > > > > > > > > > > I know that this would likely be the first step in > Gremlin > > > that > > > > > > > > codifies > > > > > > > > > a > > > > > > > > > > pattern, so I'd like to get other's thoughts on this? > > > > > > > > > > > > > > > > > > > > Dave > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
