Re: [DISCUSS] The Two Protocols of TP4

Jorge Bay Gondra Tue, 23 Apr 2019 03:10:34 -0700

Hi,
I'm still trying to catch up with TP4 topics.

I agree that we can reuse bytecode to submit Gremlin string literals,
like:

    [[submit, [ex:script, gremlin-groovy, g.V.out.name]]]


Instead of supporting a ScriptEngine or enabling providers to implement one,
TP4 could be a good opportunity to ditch script engines while continuing to
support gremlin-groovy string literals using language recognition
engines like ANTLR.

Language recognition and parsing engines have several benefits over the
current approach, most notably safety: parsing text with a language
recognizer only produces string tokens, as opposed to letting users run
code in a sandboxed VM.
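
To make the safety point concrete, here is a minimal sketch of the idea. It is a hand-rolled recognizer rather than an ANTLR grammar, and the class name, the method name and the tiny accepted syntax (parenthesized steps with single-quoted string arguments) are assumptions made up for illustration, not anything in TP4. The text is only matched and split into instruction lists; nothing in it is ever evaluated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: turns a Gremlin-like string into a list of
// instructions without evaluating it. Not TinkerPop or ANTLR API.
class GremlinTextParser {

    // One step such as .out() or .values('name'): a name plus quoted string arguments.
    private static final Pattern STEP = Pattern.compile(
            "\\.([A-Za-z]+)\\(\\s*((?:'[^']*'\\s*(?:,\\s*'[^']*'\\s*)*)?)\\)");
    private static final Pattern ARG = Pattern.compile("'([^']*)'");

    // Each instruction is a list of strings: [stepName, arg1, arg2, ...].
    static List<List<String>> parse(String script) {
        if (!script.startsWith("g")) {
            throw new IllegalArgumentException("Traversal must start with 'g'");
        }
        List<List<String>> instructions = new ArrayList<>();
        Matcher step = STEP.matcher(script);
        int pos = 1; // position right after the leading 'g'
        while (pos < script.length()) {
            // Every step must begin exactly where the previous one ended.
            step.region(pos, script.length());
            if (!step.lookingAt()) {
                throw new IllegalArgumentException("Unexpected input at offset " + pos);
            }
            List<String> instruction = new ArrayList<>();
            instruction.add(step.group(1));
            Matcher arg = ARG.matcher(step.group(2));
            while (arg.find()) {
                instruction.add(arg.group(1));
            }
            instructions.add(instruction);
            pos = step.end();
        }
        return instructions;
    }
}
```

With this sketch, parse("g.V().out().values('name')") yields the three instructions V, out and values with argument name, while anything outside the recognized syntax (a Groovy closure, for example) is rejected with an IllegalArgumentException instead of being executed.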

Jorge



On Tue, Apr 16, 2019 at 8:43 PM Marko Rodriguez <okramma...@gmail.com>
wrote:

> Hi,
>
>
> > hmm - it sounds like supporting the vm protocol requires a session. like
> > each "g" from a client needs to hold state on the server between requests.
> > or am i thinking about it too concretely and this protocol is more of an
> > abstraction of what's happening?
>
> No, you are right. It’s pretty analogous to TP3. The server holds a bunch
> of “g” instances. “g” instances are thread-safe and immutable. Submitted
> bytecode can have a source instruction that references a cached “g” on the
> server (e.g. via a UUID — though this is up to the Machine implementation).
> If it does, then that cached “g” is used to spawn the traversal via the
> operation instructions. Also, this is not just for “over the wire”
> communication. It’s not specific to server behavior. The Machine interface
> can be a LocalMachine and still you have this notion of pre-compiled source
> instructions that were machine.registered().
>
>
> https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/species/LocalMachine.java#L41
>
> Finally, if you want to build a Machine that doesn’t pre-compile the
> source instructions, well, this is what your Machine implementation looks
> like:
>
>
> https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/species/BasicMachine.java
>
> Marko.
>
> >
> >
> > On Tue, Apr 16, 2019 at 1:58 PM Marko Rodriguez <okramma...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >>> i get the "submit" part but could you explain the "register" and
> >>> "unregister" parts (referenced in another post somewhere perhaps)?
> >>
> >> These three methods are from the Machine API.
> >>
> >>
> >>
> >> https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/Machine.java
> >>
> >> Bytecode is composed of two sets of instructions.
> >>        - source instructions
> >>        - operation instructions
> >>
> >> source instructions are withProcessor(), withStructure(), withStrategy(),
> >> etc.
> >> operation instructions are out(), in(), count(), where(), etc.
> >>
> >> The source instructions are expensive to execute. Why? When you evaluate
> >> a withStructure(), you are creating a connection to the database. When you
> >> evaluate a withStrategy(), you are sorting strategies. It is for this
> >> reason that we have the concept of a TraversalSource in TP3 that does all
> >> that “setup stuff” once and only once for each g. This is why we tell
> >> people not to do graph.traversal().V(), but instead g = graph.traversal().
> >> Once you have ‘g’, you can then spawn as many traversals as you want off
> >> of it without incurring the cost of re-processing the source instructions.
> >>
> >> In TP4, there is no state in Gremlin’s TraversalSource. Gremlin doesn’t
> >> know about databases, processors, strategy compilation, etc. Thus, when you
> >> Machine.register(Bytecode) you are sending over the source instructions,
> >> having them processed at the TP4 VM, and then all subsequent submit() calls
> >> with the same source instruction header will use the “pre-compiled” source
> >> bytecode cached in the TP4 VM. g.close() basically does
> >> Machine.unregister().
> >>
> >>
> >>
> >> https://github.com/apache/tinkerpop/blob/tp4/java/language/gremlin/src/main/java/org/apache/tinkerpop/language/gremlin/TraversalSource.java#L107-L112
> >>
> >>
> >> https://github.com/apache/tinkerpop/blob/tp4/java/language/gremlin/src/main/java/org/apache/tinkerpop/language/gremlin/TraversalSource.java#L114-L116
> >>
> >> In short, we have just offloaded the TP3 TraversalSource work to TP4
> >> Machine.
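
The register/submit/unregister cycle described in the quoted text can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the string-based instructions and the UUID key are hypothetical stand-ins, not the actual TP4 Machine interface linked above. The point it shows is that the expensive source instructions are processed once at register time and reused by every later submit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the register/submit/unregister cycle.
// Names and types are illustrative, not the TP4 Machine API.
class CachingMachineSketch {

    // Cached source instructions, keyed by the id handed back from register().
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Counts how often the "expensive" source processing has run.
    private final AtomicInteger compilations = new AtomicInteger();

    // Process the source instructions once and return an id for later submits.
    String register(List<String> sourceInstructions) {
        compilations.incrementAndGet(); // stands in for connecting, sorting strategies, etc.
        String id = UUID.randomUUID().toString();
        cache.put(id, Collections.unmodifiableList(new ArrayList<>(sourceInstructions)));
        return id;
    }

    // Reuse the cached source; only the operation instructions are new work.
    // Returns the full instruction sequence that would be executed.
    List<String> submit(String sourceId, List<String> operations) {
        List<String> source = cache.get(sourceId);
        if (source == null) {
            throw new IllegalStateException("Unknown source: " + sourceId);
        }
        List<String> plan = new ArrayList<>(source);
        plan.addAll(operations);
        return plan;
    }

    // Drop the cached source, as g.close() would.
    void unregister(String sourceId) {
        cache.remove(sourceId);
    }

    int compilations() {
        return compilations.get();
    }
}
```

Registering once and submitting many times leaves compilations() at 1, which is the saving the TraversalSource gives in TP3; submitting against an id after unregister fails because the cached source is gone.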
> >>
> >> HTH,
> >> Marko.
> >>
> >> P.S. I don’t like the term “source instructions.” I’m thinking of calling
> >> them “meta instructions” or “setup instructions” or “staging instructions”
> >> … ?
> >>
> >>
> >>
> >>
> >>
> >>>
> >>> regarding this:
> >>>
> >>>> just like processing instructions are extended via namespaced
> >>>> instructions and strategies, so are server instructions
> >>>
> >>> i was thinking that an extensible bytecode model would be the solution for
> >>> these kinds of things. without the scriptengine anymore (stoked to see that
> >>> go away) graph providers with schema languages and other admin functions
> >>> will need something to replace that. what's neat about that option is that
> >>> such features would no longer need to be bound to just the JVM. Python
> >>> users could use the JanusGraph clean utility to drop a database or use
> >>> JavaScript to create a graph in DSE Graph. pretty cool.
> >>>
> >>>
> >>> On Mon, Apr 15, 2019 at 2:44 PM Marko Rodriguez <okramma...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I believe there will only be two protocols in TP4.
> >>>>
> >>>>       1. The VM communication protocol. (Rexster)
> >>>>       2. The data serialization protocol. (Frames)
> >>>>
> >>>> [VM COMMUNICATION PROTOCOL]
> >>>>
> >>>>       1. Register bytecode —returns—> bytecode.
> >>>>       2. Submit bytecode —returns—> iterator of traversers.
> >>>>       3. Unregister bytecode source —returns—> void
> >>>>
> >>>> Here is a trippy idea. These operations are simply bytecode.
> >>>>
> >>>>       1. [[register,[bytecode]]] —returns—> single traverser referencing
> >>>> bytecode.
> >>>>       2. [[submit, [bytecode]]] —returns—> many traversers referencing
> >>>> primitives.
> >>>>       3. [[unregister, [bytecode]]] —returns—> no traversers.
> >>>>
> >>>> Thus, THE ONLY THING YOU SEND TO THE TP4 VM IS BYTECODE and THE ONLY THING
> >>>> RETURNED IS ZERO OR MORE TRAVERSERS!
> >>>>
> >>>> Now, think about JanusGraph. It has database operations such as create
> >>>> index, create schema, drop graph, etc. These are just custom instructions
> >>>> in the bytecode of submit.
> >>>>
> >>>>       [[submit, [[jg:createIndex,people-idx,person]]]]
> >>>>
> >>>> A JanusGraph strategy will know what to do with that instruction and a
> >>>> traverser can be returned. Traverser.of(“SUCCESS”). And there you have it:
> >>>> just like processing instructions are extended via namespaced instructions
> >>>> and strategies, so are server instructions. Providers have an extensible
> >>>> framework to support all their custom operations because, in the end,
> >>>> it’s just bytecode, strategies, and resultant traversers! (everything is
> >>>> the same).
> >>>>
> >>>> Next, in order to send bytecode and get back traversers ‘over the wire’,
> >>>> there needs to be a serialization specification.
> >>>>
> >>>> [DATA SERIALIZATION PROTOCOL]
> >>>>
> >>>>       1. I don’t know much about GraphBinary, but I believe it’s this
> >>>> without complex types.
> >>>>               - Why?
> >>>>                       - bytecode is primitive.
> >>>>                       - traversers are primitive (as they can’t
> >>>> reference complex types; see other [DISCUSS] from today).
> >>>>
> >>>>
> >>>> Thoughts?,
> >>>> Marko.
> >>>>
> >>>> http://rredux.com
>
>
