On the concept of BytecodeStrategies

Marko Rodriguez Thu, 13 Oct 2016 05:37:55 -0700

Hello,

There are two types of “programs” in Gremlin: Bytecode and Traversals.


        Bytecode => Virtual machine instructions (like Java bytecode)
        Traversals => Machine instructions (like Intel machine code)

The core of Gremlin’s compiler is its TraversalStrategies. A traversal strategy 
works on a traversal-by-traversal level, walking the traversal tree and 
rewriting sections of the traversal into (typically) more efficient forms.

void TraversalStrategy.apply(Traversal<S,E> traversal)

Working at the Traversal object level is important because the Gremlin language 
steps (has(), out(), in(), etc.) don’t always map one-to-one with the machine 
instructions (HasStep, VertexStep, VertexStep). It’s better to work at the 
machine level because there are more knick-knack mutations one can do at that 
level. However, as you can see, traversal strategies are “machine dependent.” 
That is, they are tied to the Gremlin traversal machine implementation.
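
To make the mismatch concrete, here is a minimal Python sketch. The tuple 
encoding of bytecode and the names STEP_TABLE and compile_to_steps are 
assumptions made up for illustration; they are not the actual TinkerPop 
classes. Only the step names come from the Gremlin-Java machine.

```python
# Hypothetical, simplified model: bytecode is a list of (operator, args) tuples.
# This corresponds to g.V().has("name", "marko").out("knows").in("created").
bytecode = [
    ("V", ()),
    ("has", ("name", "marko")),
    ("out", ("knows",)),
    ("in", ("created",)),
]

# Illustrative table from language-level operators to machine-level steps.
# out() and in() both become a VertexStep; only the direction differs.
STEP_TABLE = {
    "V": lambda args: ("GraphStep", {"ids": args}),
    "has": lambda args: ("HasStep", {"key": args[0], "value": args[1]}),
    "out": lambda args: ("VertexStep", {"direction": "OUT", "labels": args}),
    "in": lambda args: ("VertexStep", {"direction": "IN", "labels": args}),
}


def compile_to_steps(bytecode):
    """Translate each bytecode instruction into a (step name, config) pair."""
    return [STEP_TABLE[op](args) for op, args in bytecode]


steps = compile_to_steps(bytecode)
step_names = [name for name, _ in steps]
```

A strategy written against the steps sees two VertexStep objects and has to 
inspect the direction, whereas a strategy written against the bytecode still 
sees the distinct out and in operators.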

While there is currently only one Gremlin virtual machine (Gremlin-Java 
machine), there are many Gremlin language variants — Gremlin-Java, -Groovy, 
-Python, SQL-Gremlin, SPARQL-Gremlin, etc. When these languages communicate 
with a/the Gremlin traversal machine, they communicate via Gremlin bytecode. 
Now, it is possible to optimize bytecode. In principle, we can do “client side” 
optimizations on the bytecode prior to sending it to the Gremlin traversal 
machine for execution. Why would we want to do this?

        1. We can reduce the amount of work (clock cycles) required of “the 
           server”, which would ultimately do the TraversalStrategy 
           optimization.
        2. We can have optimizations that are machine independent and thus 
           useful against any Gremlin traversal machine implementation.
        3. While the server is “streaming in” the bytecode, it can also 
           optimize the bytecode prior to applying TraversalStrategy 
           optimizations.

[Gremlin-Java Traversal Machine] <== network connection ==> [Gremlin-XXX Language Variant]
  * pre-process bytecode                                      * pre-process bytecode
    before translating to traversal                             before sending over network
  * apply traversal strategies
  * execute traversal

What would Bytecode strategies look like? Here is an idea:

void TraversalStrategy.apply(Bytecode bytecode)

Let’s look at a simple strategy. IdentityRemovalStrategy will turn traversals 
of the form g.V().identity().as("a").identity() into g.V().as("a"). Here is 
this strategy written in both Java and Python:

        https://gist.github.com/okram/7bb2512935f8955551f9e3f87623b488
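
For readers who cannot open the gist, the idea can be sketched in a few lines 
of Python. This is not the code from the gist: the (operator, args) tuple 
encoding of bytecode and the function name remove_identity are assumptions 
made for illustration, and nested (anonymous) traversals are modeled simply as 
lists inside the argument tuple.

```python
def remove_identity(bytecode):
    """Return a new instruction list with every identity() instruction dropped.

    Hypothetical model: bytecode is a list of (operator, args) tuples, and a
    nested traversal appears as a list among the args.
    """
    result = []
    for op, args in bytecode:
        if op == "identity":
            continue  # identity() is a no-op, so it can be dropped
        # Recurse into nested traversals, leave all other arguments untouched.
        new_args = tuple(
            remove_identity(arg) if isinstance(arg, list) else arg
            for arg in args
        )
        result.append((op, new_args))
    return result


# g.V().identity().as("a").identity()
before = [("V", ()), ("identity", ()), ("as", ("a",)), ("identity", ())]
after = remove_identity(before)

# g.V().repeat(out().identity())
nested_before = [("V", ()), ("repeat", ([("out", ()), ("identity", ())],))]
nested_after = remove_identity(nested_before)
```

After the rewrite, the as("a") instruction follows V() directly, which is the 
g.V().as("a") form given above. Nothing here depends on a traversal machine, 
so the same function could run in any language variant before the bytecode is 
sent over the network.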

Given that there (currently) is no Gremlin-Python traversal machine 
implementation, __apply_traversal(traversal) does nothing. However, given that 
there is a Gremlin-Python language variant, __apply_bytecode(traversal) does 
something. Moreover, note that we already have IdentityRemovalStrategy in 
Gremlin-Python, but, as you can see, it does nothing as (currently) strategies 
only operate on traversals.

        
        https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/gremlin_python/process/strategies.py#L117-L119

As an aside: the reason strategies exist in Gremlin-Python is so that users can 
do stuff like:
        
        https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/tests/driver/test_driver_remote_connection.py#L80-L98

Anywho, so there you have it. I’ve made a ticket:
        https://issues.apache.org/jira/browse/TINKERPOP-1501

Your thoughts on the idea are more than appreciated.

Take care,
Marko.

http://markorodriguez.com
