I like our coalesce() pattern, but it is verbose, and over time it has grown from a simple pattern into one with numerous variations covering all sorts of merge-like operations. As such, I do think we should introduce something to cover this pattern.
I like that you used the word "merge" in your description of this, as it is the word I've liked using. You might want to take a look at my proposed merge() step from earlier in the year:

https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E

I'm just going to dump thoughts as they come regarding what you wrote:

1. How would multi/meta-properties fit into the API you've proposed?

2. How would users set the T.id on creation? Would that T.id just be a key in the first Map argument?

3. I do like the general idea of a match on multiple properties for the first argument as a convenience, but I wonder about the specificity of this API a bit as it focuses heavily on equality - I suppose that's most cases for get-or-create, so perhaps that's ok.

4. I think your suggestion points to one of the troubles Gremlin has, which we see with "algorithms" - extending the language with new steps that provide a form of "sugar" (e.g. in algorithms we end up with a shortestPath() step) pollutes the core language a bit, hence my generalization of "merging" in my link above, which fits into the core Gremlin language style. There is a bigger picture where we are missing something in Gremlin that lets us extend the language in ways that let us easily introduce new steps that aren't for general purpose. This issue is discussed in terms of "algorithms" here: https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could see how there might be some "mutation" extension steps that would cover your suggested API, plus batch operations, etc. We need a way to add "sugar" without it interfering with the consistency of the core. Obviously this is a bigger issue, but perhaps one that is important to solve in order to implement steps in the fashion you describe.

5. I suppose that the reason for mergeE and mergeV is to specify what element type the first Map argument should be applied to? What about mergeVP (i.e. vertex property, as it too is an element)? That's tricky, but I don't think we should miss that. Perhaps merge() could be a "complex modulator"? That's a new concept of course, but you would do g.V().merge() and the label and first Map would fold to VertexStartStep (i.e. V()) for the lookup and then a MergeStep would follow - thus a "complex modulator", as it does more than just change the behavior of the previous step - it also adds its own. I suppose it could also add has() steps followed by the MergeStep, and then the has() operations would fold in normally as they do today. In this way, could we simplify to just one single merge(String,Map,Map)?

6. One thing neither my approach nor yours seems to do is tell the user if they created something or updated something - that's another thing I've seen users want to have in get-or-create. Here again we go deeper into a less general step specification as alluded to in 4, but a merge() step as proposed in 5 might return [Element,boolean] so as to provide an indicator of creation?

7. You were just introducing your ideas here, so perhaps you haven't gotten this far yet, but a shortcoming of merge(String,Map,Map) is that it leaves no opportunity to stream a List of Maps to a merge() for a form of batch loading, which is mighty common and one of the variations of the coalesce() pattern that I alluded to at the start of all this. I think that we would want to be sure that we left open the option to do that somehow.

8. If we had a general purpose merge() step, I wonder if it would make developing the API you suggested easier to do?

I think I'd like to solve the problems you describe in your post as well as the ones in mine. There is some relation there, but gaps as well. With more discussion here we can figure something out.

Thanks for starting this talk - good one!
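To make points 6 and 7 a bit more concrete, here is a rough sketch in plain Python of a merge that consumes a stream of Maps and reports, per Map, whether it created or found the element. None of this is TinkerPop API: the `merge` function, the `Vertex` dict type and the in-memory `graph` list are all made up for illustration, and matching is equality-only as in point 3.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Vertex = Dict[str, Any]  # hypothetical stand-in for a graph vertex


def merge(vertices: List[Vertex], label: str,
          matches: Iterable[Dict[str, Any]]) -> Iterator[Tuple[Vertex, bool]]:
    """Get-or-create one vertex per incoming match Map.

    Yields (vertex, created) pairs; created is True only when no existing
    vertex had the label and every key/value pair of the match Map.
    """
    for match in matches:
        found = next(
            (v for v in vertices
             if v["label"] == label
             and all(v.get(k) == val for k, val in match.items())),
            None,
        )
        if found is not None:
            yield found, False          # existing element, nothing created
        else:
            created = {"label": label, **match}
            vertices.append(created)    # later Maps in the batch can match it
            yield created, True


# Assumed toy data: one existing vertex and a batch of three match Maps.
graph: List[Vertex] = [{"label": "person", "name": "stephen", "age": 34}]
batch = [{"name": "stephen"}, {"name": "dave"}, {"name": "dave"}]
results = list(merge(graph, "person", batch))
flags = [created for _, created in results]
print(flags)  # [False, True, False]
```

The second "dave" Map finds the vertex created by the first one, which is the behavior I would expect from a batch load; whether a real step should behave that way inside a single traversal is one of the open questions.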
On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <[email protected]> wrote:

> I've had a few on and off discussions with a few people here, so I wanted
> to send this out to everyone for feedback.
>
> What are people's thoughts on creating a new set of steps that codify
> common Gremlin best practices?
>
> I think there are several common Gremlin patterns where users would benefit
> from the additional guidance that these codified steps represent. The
> first one I would recommend though is codifying the element existence
> pattern into a single Gremlin step, something like:
>
> mergeV(String, Map, Map)
>      String - The vertex label
>      Map (first) - The properties to match existing vertices on
>      Map (second) - Any additional properties to set if a new vertex is
>      created (optional)
> mergeE(String, Map, Map)
>      String - The edge label
>      Map (first) - The properties to match existing edges on
>      Map (second) - Any additional properties to set if a new edge is
>      created (optional)
>
> In each of these cases these steps would perform the same upsert
> functionality as the element existence pattern.
>
> Example:
>
> g.V().has('person','name','stephen').
>            fold().
>            coalesce(unfold(),
>                     addV('person').
>                       property('name','stephen').
>                       property('age',34))
>
> would become:
>
> g.mergeV('person', {'name': 'stephen'}, {'age': 34})
>
> I think that this change would be a good addition to the language for
> several reasons:
>
> * This codifies the best practice for a specific action/recipe, which
> reduces the chance that someone uses the pattern incorrectly
> * Most complex Gremlin traversals are verbose. Reducing the amount of code
> that needs to be written and maintained allows for a better developer
> experience.
> * It will lower the bar of entry for a developer by making these actions
> more discoverable. The more we can help bring these patterns to the
> forefront of the language via these pattern/meta steps the more we guide
> users towards writing better Gremlin faster
> * This allows DB vendors to optimize for this pattern
>
> I know that this would likely be the first step in Gremlin that codifies a
> pattern, so I'd like to get others' thoughts on this?
>
> Dave
>
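For reference, the upsert semantics of the proposed mergeV(String, Map, Map) can be modeled in a few lines of plain Python over an in-memory list. This is only a sketch of the behavior described in the quoted proposal, not an implementation of any real step: `merge_v`, the dict-based vertices and the `graph` list are assumptions made for the example.

```python
from typing import Any, Dict, List, Optional

Vertex = Dict[str, Any]  # hypothetical stand-in for a graph vertex


def merge_v(vertices: List[Vertex], label: str, match: Dict[str, Any],
            on_create: Optional[Dict[str, Any]] = None) -> Vertex:
    """Return the first vertex matching label + match, else create one."""
    # Lookup: plays the role of V().has(label, key, value) for each key.
    for v in vertices:
        if v["label"] == label and all(
                k in v and v[k] == val for k, val in match.items()):
            return v  # coalesce(unfold(), ...) takes the first branch
    # Creation: plays the role of addV(label).property(...) for both Maps.
    new_vertex = {"label": label, **match, **(on_create or {})}
    vertices.append(new_vertex)
    return new_vertex


graph: List[Vertex] = []
first = merge_v(graph, "person", {"name": "stephen"}, {"age": 34})
second = merge_v(graph, "person", {"name": "stephen"}, {"age": 99})
print(first)                        # {'label': 'person', 'name': 'stephen', 'age': 34}
print(second is first, len(graph))  # True 1
```

Note that the second Map is applied only on creation, so the second call leaves the age at 34. That matches the "set if a new vertex is created" wording in the proposal.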
