Good point about Gremlin Server being JVM-only. I think we might end up with a solution in which we have a process schema that constrains legal programs and makes them first-class citizens for the purpose of data exchange, and in which we also have certain core implementations in a neutral language like Idris which can be mapped into each GLV. A side-effect of constraining programs to a schema might be, as you note, that Groovy closures and other custom code would not be permitted in general. That's good from the perspective of portability and optimization, though IMO it may be too restrictive. I think we should be able to come up with some kind of hook in each language that will permit custom code, with the caveat that all the Gremlin VM can do with such a program is execute it; we can't send it over the wire etc.
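To make the hook idea concrete, here is a minimal sketch in Python, chosen purely for illustration. Everything in it is an assumption rather than an existing TinkerPop API: the `SCHEMA` table, the `Step` and `CustomStep` types, and the `to_wire`/`from_wire` functions are hypothetical names. It shows schema-constrained steps that validate and serialize, alongside a custom-code step that the VM can only execute locally.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Union

# Hypothetical process schema: legal step name -> number of arguments.
SCHEMA = {"has_label": 1, "values": 1, "limit": 1}


@dataclass(frozen=True)
class Step:
    """A schema-constrained step: plain data, so it can be exchanged."""
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class CustomStep:
    """Hook for arbitrary code: opaque to the VM, executable only locally."""
    fn: Callable[[Iterable[Any]], Iterable[Any]]


Program = List[Union[Step, CustomStep]]


def validate(program: Program) -> None:
    """Reject any schema step that the schema does not allow."""
    for step in program:
        if isinstance(step, CustomStep):
            continue  # nothing to check; the VM cannot inspect custom code
        if step.name not in SCHEMA:
            raise ValueError(f"unknown step: {step.name}")
        if len(step.args) != SCHEMA[step.name]:
            raise ValueError(f"wrong argument count for step: {step.name}")


def to_wire(program: Program) -> str:
    """Serialize a program; programs containing custom code are refused."""
    validate(program)
    if any(isinstance(s, CustomStep) for s in program):
        raise TypeError("program contains custom code; it can only be executed")
    return json.dumps([{"step": s.name, "args": list(s.args)} for s in program])


def from_wire(payload: str) -> Program:
    """Rebuild and re-validate a program received from another process."""
    program = [Step(d["step"], tuple(d["args"])) for d in json.loads(payload)]
    validate(program)
    return program


def execute(program: Program, elements: Iterable[dict]) -> list:
    """Toy interpreter over a list of dict elements."""
    validate(program)
    stream = list(elements)
    for step in program:
        if isinstance(step, CustomStep):
            stream = list(step.fn(stream))
        elif step.name == "has_label":
            stream = [e for e in stream if e.get("label") == step.args[0]]
        elif step.name == "values":
            stream = [e[step.args[0]] for e in stream if step.args[0] in e]
        elif step.name == "limit":
            stream = stream[: step.args[0]]
    return stream
```

In this sketch a program built only from `Step` values survives a round trip through `to_wire` and `from_wire`, while appending a `CustomStep` still lets `execute` run it but makes `to_wire` raise. That is the caveat described above: custom code stays possible, but it gives up portability.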
Still very much TBD is how we end up with the right interfaces, and make use of the right language-specific APIs for each GLV. For example, if Thrift is used, developers will want to read and write data using an existing Thrift API appropriate to the programming language.

Josh

On Mon, Jan 13, 2020 at 5:48 AM Stephen Mallette wrote:

> On the heels of the "structure API" discussion I thought I'd just start a
> thread for the process API for TP4. The discussion of serialization formats
> like Thrift made me think about Gremlin Server, GLVs and their overall
> relationship to the process API (i.e. Gremlin the language).
>
> I don't think TP4 should have a specific application called "Gremlin
> Server". Many graph providers don't use the component directly and it
> creates a component that only exists for the JVM but not in other language
> ecosystems. As a result I think I've noticed that users immediately start
> with a point of confusion as to what they need to get started with
> TinkerPop (despite all the documentation and explanation we provide).
>
> Let's forget about all the different graph systems out there and just think
> about the basic TP3 one-liner for creating "g":
>
> g = traversal()
>
> If every language variant could support that syntax equally with no other
> configurations we'd have something really easy to get started with. Perhaps
> traversal() would just instantiate an empty embedded in-memory graph
> (yes...that would mean having some form of TinkerGraph in each language)
> but that graph would communicate over the same protocol as though it were
> remote. From that simple start point we can start extending into remote
> configurations to explicitly connect to specific graphs in specific ways,
> in much the same manner as we do today.
> I think this approach implies that graphs which are purely embedded today
> will need to expose Gremlin Server-style functionality to be considered
> TinkerPop-enabled, or perhaps we can just wrap up their implementation
> inside of TinkerPop somehow to expose that for them. Whether they do a
> native implementation, which might afford them some benefits based on their
> platform, or rely on our implementation, the user ends up in a position
> where they no longer need to reason about that component, which is
> essentially the goal I'd like to achieve.
>
> Note that we will no longer look to support arbitrary Groovy script
> execution as part of TP4. If graph providers rely on that functionality for
> some reason they will need to account for that. Providers often support
> scripts to allow for their schema APIs to work. Given that TP4 will have
> schema support I would think that they would piggy-back on whatever
> infrastructure we supplied in support of that, but if there are other
> features needed (DS Graph has some "system" functions, for example), those
> will have to be dealt with in some way. I think certain providers like
> visualization tools and notebooks that support Gremlin may also hit some
> problems with this change. I think that the answer is pretty simple
> though...providers will just need to manage their own ScriptEngine
> implementations along with all the security/memory issues that come with
> that. I considered the notion that we might maintain gremlin-groovy and its
> ScriptEngine but not expose it as a "server" oriented feature because
> Gremlin Console kept us bound to groovysh and a lot of that code overlaps,
> however with Java now having its own shell, I wonder if we need to touch
> Groovy at all. If we were going to support a JVM language variant I'd
> probably pick a few other languages first like Clojure or Scala where the
> Java interop isn't as clean as Groovy's.
>
> I suppose that's just scratching the surface of things to consider for the
> process API for TP4 but these were the things that came to mind while
> thinking about the other thread.
>
> On Mon, Jan 13, 2020 at 7:53 AM Stephen Mallette wrote:
>
> > Thanks for trying out Idris. I had a feeling it would work the way that
> > you found it to but without actually trying it out there would be no way
> > to know for sure.
> >
> > Interesting idea to use Thrift to generate process classes like steps.
> > Having some foundational code could be helpful in starting up and
> > maintaining a GLV. With Idris I'd hoped to get more than just some
> > interfaces and some core code that could supply some working logic to
> > every language ecosystem we supported but perhaps that was asking too
> > much.
> >
> > I've looked at Thrift before as a possible serialization format for use
> > with Gremlin Server but given the adherence to schema that it required
> > I opted away from it. Given that we now look to have the notion of a
> > schema in TP4 I suppose Thrift, Protocol Buffers and other such formats
> > and protocols are back on the table for consideration. There is a whole
> > separate discussion to be had about "Gremlin Server" and the methods by
> > which users "connect" to a graph for TP4, but perhaps I will save that
> > for a separate thread so as not to redirect this one too much.
> >
> > On Fri, Jan 10, 2020 at 1:43 PM Joshua Shinavier
> [...]
