Hello,

Gremlin bytecode provides a language-agnostic way of sending Gremlin traversals between machines, whether physical or virtual. For instance, it is possible to send bytecode from one JVM to another, or from CPython to the JVM, across the network. Once bytecode is received, it needs to be translated into a representation that the processing VM can evaluate.
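
To make the translation step concrete: bytecode is essentially an ordered list of step names with their arguments, so a receiver can rebuild a traversal by replaying each entry through reflection. The following is only a minimal sketch of that idea in plain Java. The names Instruction, ToyTraversal and replay are hypothetical and are not TinkerPop's actual Bytecode or JavaTranslator classes, and the name-plus-arity method lookup is a simplifying assumption.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy classes for illustration only; NOT TinkerPop's Bytecode or JavaTranslator.
class BytecodeReplaySketch {

    // One recorded step: a method name plus its arguments.
    static final class Instruction {
        final String operator;
        final Object[] arguments;

        Instruction(String operator, Object... arguments) {
            this.operator = operator;
            this.arguments = arguments;
        }
    }

    // Stand-in for a traversal: each call only records a readable description of the step.
    public static final class ToyTraversal {
        private final List<String> steps = new ArrayList<>();

        public ToyTraversal V() { steps.add("V()"); return this; }
        public ToyTraversal has(String key, String value) { steps.add("has(" + key + "," + value + ")"); return this; }
        public ToyTraversal out() { steps.add("out()"); return this; }

        @Override
        public String toString() { return String.join(".", steps); }
    }

    // Replays each instruction by looking up a public method with the same name and
    // argument count (an assumption made for brevity) and invoking it reflectively.
    static ToyTraversal replay(List<Instruction> bytecode) {
        ToyTraversal traversal = new ToyTraversal();
        for (Instruction instruction : bytecode) {
            Method match = null;
            for (Method candidate : ToyTraversal.class.getMethods()) {
                if (candidate.getName().equals(instruction.operator)
                        && candidate.getParameterCount() == instruction.arguments.length) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("Unknown step: " + instruction.operator);
            }
            try {
                match.invoke(traversal, instruction.arguments);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not apply step: " + instruction.operator, e);
            }
        }
        return traversal;
    }

    public static void main(String[] args) {
        List<Instruction> bytecode = Arrays.asList(
                new Instruction("V"),
                new Instruction("has", "name", "marko"),
                new Instruction("out"),
                new Instruction("out"));
        System.out.println(replay(bytecode)); // prints V().has(name,marko).out().out()
    }
}
```

No script text is generated or compiled in this approach; the cost is one method lookup and one reflective call per instruction.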

GremlinServer is smart in that when bytecode is received it will analyze it for lambdas. If there are lambdas, written in language X, then it will use XTranslator and XScriptEngine to evaluate the bytecode and create a Traversal for evaluation. However, if there are no lambdas, then it will use JavaTranslator to create a Traversal for evaluation.

So, the question for me is: Is JavaTranslator (which uses Java reflection to convert bytecode to a Traversal) faster than GroovyTranslator/GroovyScriptEngine (which creates a String script and evaluates it in the ScriptEngine)? Let's see. Here is our script in total.

    import org.apache.tinkerpop.gremlin.jsr223.JavaTranslator
    import org.apache.tinkerpop.gremlin.groovy.jsr223.GroovyTranslator

    //// EXECUTED LOCALLY (e.g. CLIENT APPLICATION) ////
    g = EmptyGraph.instance().traversal()
    t = g.V().has('name','marko').
          repeat(out()).times(2).
          groupCount().by('name'); []
    bytecode = t.bytecode
    // send the bytecode over the wire

    //// EXECUTED REMOTELY (e.g. GREMLIN SERVER) ////
    groovy = new GremlinGroovyScriptEngine()
    bindings = groovy.createBindings()
    bindings.put('g',g)
    compiled = groovy.compile(GroovyTranslator.of('g').translate(bytecode))

    x = JavaTranslator.of(g).translate(bytecode); []
    y = compiled.eval(bindings); []
    z = groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings); []
    x == y
    y == z
    z == x
    x.toString()

    clock(1000){ JavaTranslator.of(g).translate(bytecode) }
    clock(1000){ compiled.eval(bindings) } // caching
    clock(1000){ groovy.reset(); groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings) } // no caching

First, let's make sure they all return the same traversal:

    gremlin> x = JavaTranslator.of(g).translate(bytecode); []
    gremlin> y = compiled.eval(bindings); []
    gremlin> z = groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings); []
    gremlin> x == y
    ==>true
    gremlin> y == z
    ==>true
    gremlin> z == x
    ==>true
    gremlin> x.toString()
    ==>[GraphStep(vertex,[]), HasStep([name.eq(marko)]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), GroupCountStep(value(name))]
    gremlin>

Great. They do. Now let's see how fast they are.

    gremlin> clock(1000){ JavaTranslator.of(g).translate(bytecode) }
    ==>0.004768085
    gremlin> clock(1000){ compiled.eval(bindings) } // caching
    ==>0.015168259
    gremlin> clock(1000){ groovy.reset(); groovy.eval(GroovyTranslator.of('g').translate(bytecode), bindings) } // no caching
    ==>40.790075693
    gremlin>

Cool. clock() reports the average time per run in milliseconds, so JavaTranslator is roughly 8,500x faster than evaluating an uncached String script (40.79 / 0.0048) and about 3x faster than evaluating a compiled script (0.0152 / 0.0048). JavaTranslator takes about 5 microseconds to translate the bytecode, while an uncached String script takes about 40 milliseconds.

So, what did we learn?

1. Bytecode is slick in that we don't have to use Gremlin-Groovy to evaluate it (if there are no lambdas) and thus can do everything in Java, and fast!
2. It is very important to always use parameterized queries with GremlinServer/etc., as you can see how costly it is to evaluate a String script repeatedly.

What is crazy is that my JavaTranslator code is pretty crude.

https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/jsr223/JavaTranslator.java

If anyone wants to submit a PR to make JavaTranslator more efficient, please do. However, we are still doing well with what we have regardless.

Take care,
Marko.

http://markorodriguez.com