I like our coalesce() pattern, but it is verbose, and over time it has grown
from a simple pattern into one with numerous variations covering all sorts
of merge-like operations. As such, I do think we should introduce something
to cover this pattern.

I like that you used the word "merge" in your description of this, as it is
the word I've liked using. You might want to take a look at my proposed
merge() step from earlier in the year:

https://lists.apache.org/thread.html/r34ff112e18f4e763390303501fc07c82559d71667444339bde61053f%40%3Cdev.tinkerpop.apache.org%3E

I'm just going to dump thoughts as they come regarding what you wrote:

1. How would multi/meta-properties fit into the API you've proposed?
2. How would users set the T.id on creation? Would that T.id just be a key
in the first Map argument?
3. I do like the general idea of a match on multiple properties for the
first argument as a convenience but wonder about the specificity of this
API a bit as it focuses heavily on equality - I suppose that's most cases
for get-or-create, so perhaps that's ok.
4. I think your suggestion points to one of the troubles Gremlin has which
we see with "algorithms" - extending the language with new steps that
provide a form of "sugar" (e.g. in algorithms we end up with the
shortestPath() step) pollutes the core language a bit, hence my
generalization of "merging" in my link above, which fits into the core
Gremlin language style. There is a bigger picture where we are missing
something in Gremlin that lets us extend the language in ways that let us
easily introduce new steps that aren't general purpose. This issue is
discussed in terms of "algorithms" here:
https://issues.apache.org/jira/browse/TINKERPOP-1991 but I think we could
see how there might be some "mutation" extension steps that would cover
your suggested API, plus batch operations, etc. We need a way to add
"sugar" without it interfering with the consistency of the core. Obviously
this is a bigger issue but perhaps important to solve to implement steps in
the fashion you describe.
5. I suppose that the reason for mergeE and mergeV is to specify what
element type the first Map argument should be applied to? What about
mergeVP (i.e. vertex property, as it too is an element)? That's tricky but
I don't think we should miss that. Perhaps merge() could be a "complex
modulator"? That's a new concept of course, but you would do g.V().merge()
and the label and first Map would fold to VertexStartStep (i.e. V()) for
the lookup and then a MergeStep would follow - thus a "complex modulator"
as it does more than just change the behavior of the previous step - it
also adds its own. I suppose it could also add has() steps followed by the
MergeStep and then the has() operations would fold in normally as they do
today. In this way, we could simplify to just one single
merge(String,Map,Map)?
6. One thing neither my approach nor yours seems to do is tell the user if
they created something or updated something - that's another thing I've
seen users want to have in get-or-create. Here again we go deeper into a
less general step specification as alluded to in 4, but a merge() step as
proposed in 5 might return [Element,boolean] so as to provide an indicator
of creation?
7. You were just introducing your ideas here, so perhaps you haven't gotten
this far yet, but a shortcoming of merge(String,Map,Map) is that it
leaves no opportunity to stream a List of Maps to a merge() for a form
of batch loading, which is mighty common and one of the variations of the
coalesce() pattern that I alluded to at the start of all this. I think that
we would want to be sure that we left open the option to do that somehow.
8. If we had a general purpose merge() step, I wonder if it would make
developing the API you suggested easier to do?
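
To make points 6 and 7 a bit more concrete, here is a rough sketch of the
semantics I have in mind. It is plain Python over an in-memory list, not
Gremlin and not any TinkerPop API: merge_v, merge_all, and the dict-based
"vertex" are all hypothetical names made up for illustration. It only shows
a get-or-create that reports whether it created something, and how a stream
of Maps could feed it for batch loading.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-in for a vertex: a dict with a "label" key plus properties.
Vertex = Dict[str, Any]


def merge_v(graph: List[Vertex], label: str, match: Dict[str, Any],
            on_create: Optional[Dict[str, Any]] = None) -> Tuple[Vertex, bool]:
    """Get-or-create. Returns (vertex, created) so the caller can tell
    whether the vertex already existed (point 6)."""
    for v in graph:
        # Equality match on the label and on every key of the first Map.
        if v["label"] == label and all(v.get(k) == val for k, val in match.items()):
            return v, False
    # No match: create with the match properties plus the create-only ones.
    v = {"label": label, **match, **(on_create or {})}
    graph.append(v)
    return v, True


def merge_all(graph: List[Vertex], label: str, rows: Iterable[Dict[str, Any]],
              match_keys: List[str]) -> Iterator[Tuple[Vertex, bool]]:
    """Stream a sequence of Maps through merge_v (point 7, batch loading).
    match_keys (an assumption of this sketch) says which keys identify
    an existing vertex; the remaining keys are only set on creation."""
    for row in rows:
        match = {k: row[k] for k in match_keys}
        extra = {k: val for k, val in row.items() if k not in match_keys}
        yield merge_v(graph, label, match, extra)
```

Calling merge_v(graph, 'person', {'name': 'stephen'}, {'age': 34}) twice
would create on the first call and only match on the second, leaving the
age untouched. Whether a real step should behave that way on a match
(versus updating) is exactly the sort of thing we'd need to decide.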

I think I'd like to solve the problems you describe in your post as well as
the ones in mine. There is some relation there, but gaps as well. With more
discussion here we can figure something out.

Thanks for starting this talk - good one!



On Wed, Sep 16, 2020 at 9:26 PM David Bechberger <[email protected]>
wrote:

> I've had a few on and off discussions with a few people here, so I wanted
> to send this out to everyone for feedback.
>
> What are people's thoughts on creating a new set of steps that codify
> common Gremlin best practices?
>
> I think there are several common Gremlin patterns where users would benefit
> from the additional guidance that these codified steps represent.  The
> first one I would recommend though is codifying the element existence
> pattern into a single Gremlin step, something like:
>
> mergeV(String, Map, Map)
>      String - The vertex label
>      Map (first) - The properties to match existing vertices on
>      Map (second) - Any additional properties to set if a new vertex is
> created (optional)
> mergeE(String, Map, Map)
>      String - The edge label
>      Map (first) - The properties to match existing edge on
>      Map (second) - Any additional properties to set if a new edge is
> created (optional)
>
> In each of these cases these steps would perform the same upsert
> functionality as the element existence pattern.
>
> Example:
>
> g.V().has('person','name','stephen').
>            fold().
>            coalesce(unfold(),
>                     addV('person').
>                       property('name','stephen').
>                       property('age',34))
>
> would become:
>
> g.mergeV('person', {'name': 'stephen'}, {'age': 34})
>
> I think that this change would be a good addition to the language for
> several reasons:
>
> * This codifies the best practice for a specific action/recipe, which
> reduces the chance that someone uses the pattern incorrectly
> * Most complex Gremlin traversals are verbose.  Reducing the amount of code
> that needs to be written and maintained allows for a better developer
> experience.
> * It will lower the bar of entry for a developer by making these actions
> more discoverable.  The more we can help bring these patterns to the
> forefront of the language via these pattern/meta steps the more we guide
> users towards writing better Gremlin faster
> * This allows DB vendors to optimize for this pattern
>
> I know that this would likely be the first step in Gremlin that codifies a
> pattern, so I'd like to get others' thoughts on this.
>
> Dave
>
