Hi,
One point of clarification.
You might ask:
“Why not just sort strategies in TraversalSource like in TP3, e.g. in
DefaultTraversalStrategies.class?”
Answer:
TraversalSource is Gremlin-specific. TP4 treats Gremlin as simply an
implementation of the builder pattern for Bytecode, with no other state. By
moving pre-compilation to the Machine interface, other languages get
pre-compilation for free.
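To illustrate, here is a minimal sketch in Java, not the actual TP4 code.
Every name in it (SketchMachine, SketchLocalMachine, GremlinLikeSource,
OtherLanguageSource) is hypothetical, a List<String> stands in for Bytecode,
and "pre-compilation" is reduced to sorting strategy names. The point is only
that the language front ends build instructions and nothing else, while the
machine does the pre-compilation for all of them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the Machine interface; the real TP4 signatures may differ.
interface SketchMachine {
    // Pre-compiles the source instructions and returns a registration id.
    UUID register(List<String> sourceInstructions);
}

// Hypothetical local machine: pre-compilation is reduced to sorting strategy names.
final class SketchLocalMachine implements SketchMachine {
    final Map<UUID, List<String>> sources = new HashMap<>();
    int compilations = 0;

    @Override
    public UUID register(List<String> sourceInstructions) {
        List<String> sorted = new ArrayList<>(sourceInstructions);
        sorted.sort(String::compareTo); // stand-in for strategy sorting
        compilations++;
        UUID id = UUID.randomUUID();
        sources.put(id, sorted);
        return id;
    }
}

// Two hypothetical language front ends: each one only builds source instructions.
final class GremlinLikeSource {
    final List<String> instructions = new ArrayList<>();
    GremlinLikeSource withStrategy(String name) { instructions.add(name); return this; }
}

final class OtherLanguageSource {
    final List<String> instructions = new ArrayList<>();
    OtherLanguageSource use(String name) { instructions.add(name); return this; }
}

final class PrecompileSketch {
    public static void main(String[] args) {
        SketchLocalMachine machine = new SketchLocalMachine();
        UUID a = machine.register(
                new GremlinLikeSource().withStrategy("b").withStrategy("a").instructions);
        UUID b = machine.register(
                new OtherLanguageSource().use("z").use("y").instructions);
        System.out.println(machine.sources.get(a)); // prints [a, b]
        System.out.println(machine.sources.get(b)); // prints [y, z]
    }
}
```

Neither front end knows how to sort strategies; both get it from the machine.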
Marko.
> On Mar 27, 2019, at 7:00 AM, Marko Rodriguez <[email protected]> wrote:
>
> Hi,
>
>> LocalMachine, it will look up the registered UUID and if it exists, use the
>> pre-compiled source code.
>
> So, in general, what Machine.register() does is up to the implementation.
>
> LocalMachine.register() does what TP3 does in TraversalSource. It
> “pre-compiles”.
>
> - sorts strategies
> TP3:
> https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSource.java#L138
>
> https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/DefaultTraversalStrategies.java#L47
> - sets up processor
> TP3:
> https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSource.java#L141
> - sets up structure
> TP3: this came for free because we created the traversal source
> from Graph.traversal().
>
> This way when you keep spawning traversals off of the same “g” we don’t have
> to re-compile the source instructions.
>
>> maybe i didn't follow properly but is this for the purpose of caching
>> traversals to avoid the costs of traversal to bytecode compilation?
>
>
> Note that this is a SourceCompilation (only the source instructions are
> compiled), not a Compilation, which covers the full instructions.
>
>> in other words, is this describing a general way to cache compiled bytecode
>> so that it doesn't have to go through strategy application more than once?
>
>
> As for caching traversals, that is easy to do with the Machine interface.
> Machine.submit() can be backed by a Map<Bytecode,Compilation>, same as TP3.
> However, check this: we can do it another way. Why even send the full
> Bytecode? If the RemoteMachine (which is local to the client) knows it
> already sent the same Bytecode before, it can send a single-instruction
> Bytecode with an encoded UUID-like instruction. Thus, Map<UUID,Compilation>.
> Less data to transfer.
>
> RemoteMachine (client side) can keep a Map<Bytecode,UUID> and do the proper
> UUID-encoding.
> MachineServer (server side) can then keep a Map<Bytecode,Compilation>, where,
> if the received Bytecode is a single UUID-like instruction, the lookup is
> fast. If not, it can still look it up!
>
> Thus, it is easy for us to do both types of caching with the Machine
> interface:
>
> SourceCompilation: source bytecode caching.
> Compilation: full bytecode caching.
>
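A rough sketch of the UUID round trip described above, not the actual TP4
code. The class names (SketchRemoteMachine, SketchMachineServer) are
hypothetical, a List<String> stands in for Bytecode, a String stands in for
Compilation, and the "wire" is just a direct method call. The server here keys
its cache by UUID only; keying by the single-instruction Bytecode, as
described above, would work the same way.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical server side: caches compilations by UUID and counts compiles.
final class SketchMachineServer {
    private final Map<UUID, String> compilations = new HashMap<>();
    int compileCount = 0;

    // Full submission: compile once and remember the result under the given id.
    String submit(UUID id, List<String> bytecode) {
        return compilations.computeIfAbsent(id, k -> {
            compileCount++;
            return "compiled" + bytecode; // stand-in for a real Compilation
        });
    }

    // Short submission: only the id travels; returns null if the id is unknown.
    String submit(UUID id) {
        return compilations.get(id);
    }
}

// Hypothetical client side: remembers which bytecode it has already sent.
final class SketchRemoteMachine {
    private final Map<List<String>, UUID> sent = new HashMap<>();
    private final SketchMachineServer server;
    int fullSends = 0;

    SketchRemoteMachine(SketchMachineServer server) {
        this.server = server;
    }

    String submit(List<String> bytecode) {
        UUID known = sent.get(bytecode);
        if (known != null) {
            // Already sent before: try the short, id-only submission first.
            String cached = server.submit(known);
            if (cached != null) return cached;
        }
        // First time (or the server forgot the id): send the full bytecode.
        UUID id = known != null ? known : UUID.randomUUID();
        sent.put(bytecode, id);
        fullSends++;
        return server.submit(id, bytecode);
    }
}

final class UuidCacheSketch {
    public static void main(String[] args) {
        SketchMachineServer server = new SketchMachineServer();
        SketchRemoteMachine remote = new SketchRemoteMachine(server);
        List<String> bytecode = List.of("V", "out", "count");
        remote.submit(bytecode);
        remote.submit(bytecode); // second call sends only the id
        System.out.println(remote.fullSends + " full send, "
                + server.compileCount + " compile");
    }
}
```

The fallback to a full send covers the case where the server has dropped the
id, so the client-side map never has to be authoritative.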
> Keep the questions coming.
>
> Marko.
>
> http://markorodriguez.com
>
>>
>> On Mon, Mar 25, 2019 at 8:48 AM Marko Rodriguez <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Here is how the TP4 bytecode submission infrastructure is looking.
>>>
>>> In TP3, TraversalSource maintained the “pre-compilation” of strategies,
>>> database connectivity, etc. This was not smart for the following reasons:
>>>
>>> 1. It assumed the traversal would execute on the same machine that
>>> it was created on.
>>> 2. We had to make an explicit distinction between local and remote
>>> execution via RemoteStrategy.
>>> 3. RemoteStrategy passes an excessive amount of data over the wire
>>> on each traversal submission (the source instructions!).
>>> 4. RemoteStrategy is bug prone with traversal inspection and
>>> RemoteStep, etc.
>>>
>>> In TP4, we are now going to assume that Bytecode (a traversal) is always
>>> submitted somewhere and this “somewhere” could be local or remote. This
>>> “somewhere” must implement the Machine interface.
>>>
>>>
>>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/Machine.java
>>>
>>> Machine makes explicit the TP4 communication protocol. The only objects
>>> being transmitted are either Bytecode or Traversers. Simple.
>>>
>>> Here is an example using LocalMachine:
>>>
>>> Machine machine = LocalMachine.open();
>>> TraversalSource g =
>>>     Gremlin.traversal(machine).withProcessor(…).withStructure(…).withStrategy(…);
>>>
>>> The first time a traversal is generated from g, the Bytecode source
>>> instructions are registered with the machine.
>>>
>>>
>>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/language/gremlin/src/main/java/org/apache/tinkerpop/language/gremlin/TraversalSource.java#L99-L104
>>>
>>> The intention is that, on registration, the Machine will pre-compile the
>>> source instructions (sort strategies, ensure processor and structure
>>> setup/connectivity). Machine.register() returns a new Bytecode which
>>> contains the registration information for future lookup. This registration
>>> information is Machine-specific and can even be nothing!
>>>
>>>
>>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/BasicMachine.java#L37-L39
>>>
>>> However, more intelligently, LocalMachine maintains a
>>> Map<UUID,SourceCompilation> which holds the pre-compiled source
>>> instructions.
>>>
>>>
>>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/LocalMachine.java#L47-L62
>>>
>>> Now when bytecode (containing instructions for execution) is submitted to
>>> LocalMachine, it will look up the registered UUID and if it exists, use the
>>> pre-compiled source code. As you can see, a pre-compilation has everything
>>> staged and ready for use.
>>>
>>>
>>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/bytecode/compiler/SourceCompilation.java
>>>
>>> For remote execution, we simply need RemoteMachine which would serialize
>>> and deserialize Bytecode and Traversers to some RemoteMachineServer (or
>>> some provider-specific server able to use the basic protocol we will
>>> develop). For instance:
>>>
>>> Machine machine = RemoteMachine.open(Map.of("ip", "127.0.0.1", "port", "32"));
>>> TraversalSource g =
>>>     Gremlin.traversal(machine).withProcessor(…).withStructure(…).withStrategy(…);
>>>
>>> // prior to V(), the bytecode is registered and a new “registration”
>>> // bytecode is returned and appended with V and count instructions.
>>> g.V().count();
>>>
>>> // no registration occurs as the TraversalSource hasn’t changed, the
>>> // bytecode is simply submitted.
>>> g.V().out().count();
>>>
>>> // the remote registration is removed
>>> g.close();
>>>
>>> // a new registration occurs
>>> g = g.withStrategy(…);
>>> g.V().drop();
>>>
>>> // the remote registration is removed
>>> g.close();
>>>
>>> Tada!
>>>
>>> WDYT?
>>> Marko.
>>>
>>> http://rredux.com
>