Hi,
> LocalMachine, it will lookup the registered UUID and if it exists, use the
> pre-compiled source code.
What Machine.register() does is, in general, up to the implementation.
LocalMachine.register() does what TP3 does in TraversalSource: it
“pre-compiles”.
- sort strategies
TP3:
https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSource.java#L138
https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/DefaultTraversalStrategies.java#L47
- sets up processor
TP3:
https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/TraversalSource.java#L141
- sets up structure
TP3: this came for free because we created the traversal source
from Graph.traversal().
This way when you keep spawning traversals off of the same “g” we don’t have to
re-compile the source instructions.
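
To make the registration idea concrete, here is a minimal sketch of a UUID-keyed registry. It is not the actual LocalMachine code: the class names, the string-typed strategies/processor/structure, and the alphabetical sort (a placeholder for real strategy ordering) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: these classes are illustrative stand-ins, not the TP4 sources.
class SourceRegistrySketch {

    // Stand-in for SourceCompilation: everything is staged once at registration.
    static final class SourceCompilation {
        final List<String> sortedStrategies;
        final String processor;
        final String structure;

        SourceCompilation(List<String> strategies, String processor, String structure) {
            List<String> sorted = new ArrayList<>(strategies);
            Collections.sort(sorted); // placeholder for real strategy ordering
            this.sortedStrategies = Collections.unmodifiableList(sorted);
            this.processor = processor;
            this.structure = structure;
        }
    }

    private final Map<UUID, SourceCompilation> sources = new ConcurrentHashMap<>();

    // Pre-compile the source instructions once and hand back an id for later lookup.
    UUID register(List<String> strategies, String processor, String structure) {
        UUID id = UUID.randomUUID();
        sources.put(id, new SourceCompilation(strategies, processor, structure));
        return id;
    }

    // Every traversal spawned from the same "g" reuses the staged compilation.
    SourceCompilation lookup(UUID id) {
        return sources.get(id);
    }

    // Mirrors g.close(): the registration is dropped.
    void close(UUID id) {
        sources.remove(id);
    }

    public static void main(String[] args) {
        SourceRegistrySketch machine = new SourceRegistrySketch();
        UUID id = machine.register(
                List.of("strategyB", "strategyA"), "someProcessor", "someStructure");
        System.out.println(machine.lookup(id).sortedStrategies);
        machine.close(id);
        System.out.println(machine.lookup(id));
    }
}
```

The point of the sketch is only that the sorting and setup happen in register(), so each later lookup is a map access rather than a re-compilation.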
> maybe i didn't follow properly but is this for the purpose of caching
> traversals to avoid the costs of traversal to bytecode compilation?
Note this is a SourceCompilation (just the source instructions are compiled),
not the full set of instructions, which is a Compilation.
> in other words, is this describing a general way to cache compiled bytecode
> so
> that it doesn't have to go through strategy application more than once?
As for caching traversals, that is easy to do with the Machine interface.
Machine.submit() can keep a Map<Bytecode,Compilation>, same as TP3. However,
check this, we can do it another way. Why even send the full Bytecode? If the
RemoteMachine (which is local to the client) knows it already sent the same
Bytecode before, it can send a single-instruction Bytecode with an encoded
UUID-like instruction. Thus, Map<UUID,Compilation>. Less data to transfer.
RemoteMachine (client side) can keep a Map<Bytecode,UUID> and do the proper
UUID-encoding.
MachineServer (server side) can then keep a Map<Bytecode,Compilation>. If the
received Bytecode is a single UUID-like instruction, it is a fast lookup. If
not, it can still look it up!
Thus, it is easy for us to do both types of caching with the Machine interface:
SourceCompilation: source bytecode caching.
Compilation: full bytecode caching.
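
Here is a rough sketch of the full bytecode caching described above. Everything in it is hypothetical: bytecode is modelled as a list of instruction strings, a compilation as a plain string, and the first submission is assumed to carry both the full bytecode and the id the client will use afterwards (the actual wire protocol is not defined yet).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Sketch only: none of these names are TP4 classes.
class BytecodeCacheSketch {

    // A single-instruction bytecode that stands for previously sent bytecode.
    static List<String> uuidBytecode(UUID id) {
        return List.of("uuid:" + id);
    }

    static class MachineServer {
        private final Map<List<String>, String> compilations = new HashMap<>();
        int compileCount = 0;

        private String compile(List<String> bytecode) {
            compileCount++; // stands in for strategy application and compilation
            return "compiled" + bytecode;
        }

        // First submission: full bytecode plus the id the client will send next time.
        String submitFull(UUID id, List<String> bytecode) {
            String compilation = compilations.computeIfAbsent(bytecode, this::compile);
            compilations.put(uuidBytecode(id), compilation);
            return compilation;
        }

        // Any later submission is one map lookup, whether the key is full
        // bytecode or a single UUID-like instruction.
        String submit(List<String> bytecode) {
            return compilations.computeIfAbsent(bytecode, this::compile);
        }
    }

    static class RemoteMachine {
        private final MachineServer server;
        private final Map<List<String>, UUID> sent = new HashMap<>();
        int instructionsSent = 0; // crude measure of data put on the wire

        RemoteMachine(MachineServer server) {
            this.server = server;
        }

        String submit(List<String> bytecode) {
            UUID id = sent.get(bytecode);
            if (id == null) {
                // Never sent before: remember it and ship the whole thing.
                id = UUID.randomUUID();
                sent.put(bytecode, id);
                instructionsSent += bytecode.size();
                return server.submitFull(id, bytecode);
            }
            // Sent before: ship only the UUID-like instruction.
            instructionsSent += 1;
            return server.submit(uuidBytecode(id));
        }
    }

    public static void main(String[] args) {
        MachineServer server = new MachineServer();
        RemoteMachine machine = new RemoteMachine(server);
        List<String> bytecode = List.of("V", "out", "count");
        machine.submit(bytecode);
        machine.submit(bytecode);
        System.out.println("compilations: " + server.compileCount);
        System.out.println("instructions sent: " + machine.instructionsSent);
    }
}
```

A real server would presumably reject an unknown UUID-like instruction rather than compile it; the sketch skips that check to stay short.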
Keep the questions coming.
Marko.
http://markorodriguez.com
>
> On Mon, Mar 25, 2019 at 8:48 AM Marko Rodriguez <[email protected]
> <mailto:[email protected]>>
> wrote:
>
>> Hi,
>>
>> Here is how the TP4 bytecode submission infrastructure is looking.
>>
>> In TP3, TraversalSource maintained the “pre-compilation” of strategies,
>> database connectivity, etc. This was not smart for the following reasons:
>>
>> 1. It assumed the traversal would execute on the same machine that
>> it was created on.
>> 2. We had to make an explicit distinction between local and remote
>> execution via RemoteStrategy.
>> 3. RemoteStrategy passes an excessive amount of data over the wire
>> on each traversal submission (the source instructions!).
>> 4. RemoteStrategy is bug prone with traversal inspection and
>> RemoteStep, etc.
>>
>> In TP4, we are now going to assume that Bytecode (a traversal) is always
>> submitted somewhere and this “somewhere” could be local or remote. This
>> “somewhere” must implement the Machine interface.
>>
>>
>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/Machine.java
>>
>> Machine makes explicit the TP4 communication protocol. The only objects
>> being transmitted are either Bytecode or Traversers. Simple.
>>
>> Here is an example using LocalMachine:
>>
>> Machine machine = LocalMachine.open();
>> TraversalSource g =
>> Gremlin.traversal(machine).withProcessor(…).withStructure(…).withStrategy(…)
>>
>> The first time a traversal is generated from g, the Bytecode source
>> instructions are registered with the machine.
>>
>>
>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/language/gremlin/src/main/java/org/apache/tinkerpop/language/gremlin/TraversalSource.java#L99-L104
>>
>> The intention is that, on registration, the Machine will pre-compile the
>> source instructions (sort strategies, ensure processor and structure
>> setup/connectivity). Machine.register() returns a new Bytecode which
>> contains the registration information for future lookup. This registration
>> information is Machine-specific and can even be nothing!
>>
>>
>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/BasicMachine.java#L37-L39
>>
>> However, more intelligently, LocalMachine maintains a
>> Map<UUID,SourceCompilation> which maintains pre-compiled source
>> instructions.
>>
>>
>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/LocalMachine.java#L47-L62
>>
>> Now when bytecode (containing instructions for execution) is submitted to
>> LocalMachine, it will lookup the registered UUID and if it exists, use the
>> pre-compiled source code. As you can see, a pre-compilation has everything
>> staged and ready for use.
>>
>>
>> https://github.com/apache/tinkerpop/blob/596caf3ab82f3b15c2c343af87be6d03f26d6d6e/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/bytecode/compiler/SourceCompilation.java
>>
>> For remote execution, we simply need RemoteMachine which would serialize
>> and deserialize Bytecode and Traversers to some RemoteMachineServer (or
>> some provider-specific server able to use the basic protocol we will
>> develop). For instance:
>>
>> Machine machine = RemoteMachine.open(Map.of("ip","127.0.0.1","port","32"))
>> TraversalSource g =
>> Gremlin.traversal(machine).withProcessor(…).withStructure(…).withStrategy(…)
>>
>> // prior to V(), the bytecode is registered and a new “registration”
>> // bytecode is returned and appended with V and count instructions.
>> g.V().count()
>>
>> // no registration occurs as the TraversalSource hasn’t changed, the
>> // bytecode is simply submitted.
>> g.V().out().count()
>>
>> // the remote registration is removed
>> g.close()
>>
>> // a new registration occurs
>> g = g.withStrategy(…)
>> g.V().drop()
>>
>> // the remote registration is removed
>> g.close()
>>
>> Tada!
>>
>> WDYT?
>> Marko.
>>
>> http://rredux.com