[ https://issues.apache.org/jira/browse/TINKERPOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294262#comment-15294262 ]
Dylan Bethune-Waddell edited comment on TINKERPOP-1230 at 5/20/16 9:28 PM:
---------------------------------------------------------------------------

I did some looking into this for Groovy closures and found a few resources that highlight the problem with serializing a Groovy closure and a workable solution to it.

The issues:
1) Some probably important internal fields (e.g. owner, delegate, thisObject) are not guaranteed to implement Serializable or be serializable and must be stripped - Groovy provides dehydrate()/rehydrate() for this (see https://issues.apache.org/jira/browse/GROOVY-5151)
2) As Stephen pointed out, the big problem seems to be getting the bytecode for the closure to go across the wire too, and then be deserialized and loaded properly at the other end without exploding.

These two blog posts, where the latter picked up on the work of the former, build towards the solution.

1) http://seeallhearall.blogspot.ca/2012/01/remoting-groovy-with-generated-closures.html
- Difficulty of acquiring the closure bytecode and potential red herrings out there on the internet
- Retrieve a closure's bytecode using java.lang.instrument.ClassFileTransformer
- Attach a Java agent to a running JVM not started with -javaagent -> access instrumentation instance
- Overall, getting a closure's bytecode reduced to this:

{code:java}
import org.helios.gmx.classloading.*
foo = {message -> println message}
bytes = ByteCodeRepository.getInstance().getByteCode(foo.getClass())
println "Class:${foo.getClass().getName()} ByteCode:${bytes.length} bytes"
{code}

2) Expanding on (1), http://thegridman.com/uncategorized/groovy-oracle-coherence-yeah-baby/
- Using the ByteCodeRepository class from GroovyMX mentioned above to get Closure bytecode
- Implements a GroovyClosureSerializer class (POF, KryoSerializer would be similar right?)
- Implements a GroovyClosureClassLoader ensuring bytecode is used when the closure is deserialized

A slightly abridged passage from article (2), because it sums up the whole approach and I'm certainly not going to put it in my own words any better:

"In effect what the GroovyMX does is use the Java Agent API to attach to the current process and then use the instrumentation API to be able to intercept class loading and see byte code. Rather than copy the techniques or pull out bits of code it was easier at this point to just include the GroovyMX jar as a dependency of my code and use the couple of classes I needed directly. To obtain the byte code of a class GroovyMX contains a class called org.helios.gmx.util.ByteCodeNet which has a method called getClassBytes. This method returns a Map keyed on class name with values as byte[] which are then easy to POF serialize. As you can see it is pretty simple with the addition of two lines to the serialize method. We can also change the deserialize method to deserialize the byte code. But we still have a problem as what do we do with the byte code to make sure it gets used. The obvious answer is we need a special ClassLoader that we can pass this byte code to and that will be used when we deserialize the Closure. Now we can add this [GroovyClosureClassLoader] to our derserialize method and make our DefaultSerializer use this ClassLoader instead of the context ClassLoader."

He goes on to show this working in action. Hope this helps somewhat.

> Serialising lambdas for RemoteGraph
> -----------------------------------
>
>                 Key: TINKERPOP-1230
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1230
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: driver, server
>    Affects Versions: 3.1.1-incubating
>            Reporter: Michael Pollmeier
>            Priority: Minor
>
> I just made an attempt to serialise lambdas and send them via the RemoteGraph.
> I didn't quite get there, but wanted to share my findings:
> * it's possible to serialise lambdas on the JVM by just extending `Serializable`: http://stackoverflow.com/questions/22807912/how-to-serialize-a-lambda/22808112#22808112
> * sending a normal predicate doesn't work (this is a Scala REPL but it should be pretty easy to convert this to java/groovy)
>   val g = RemoteGraph.open("conf/remote-graph.properties").traversal()
>   val pred1 = new java.util.function.Predicate[Traverser[Vertex]] { def test(v: Traverser[Vertex]) = true }
>   g.V().filter(pred1).toList
>   // java.lang.RuntimeException: java.io.NotSerializableException: $anon$1
>   // on server: nothing
>
> * simply adding Serializable lets us send it over the wire, but the server doesn't deserialise it
>   val pred2 = new java.util.function.Predicate[Traverser[Vertex]] with Serializable { def test(v: Traverser[Vertex]) = true }
>   g.V().filter(pred2).toList
>   // on server: [WARN] OpExecutorHandler - Could not deserialize the Traversal instance
>   org.apache.tinkerpop.gremlin.server.op.OpProcessorException: Could not deserialize the Traversal instance
>       at org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.iterateOp(TraversalOpProcessor.java:135)
>       at org.apache.tinkerpop.gremlin.server.handler.OpExecutorHandler.channelRead0(OpExecutorHandler.java:68)
>   // on client: org.apache.tinkerpop.gremlin.driver.exception.ResponseException: $anon$1
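
For readers who want to try the first finding from the issue description in plain Java, here is a minimal sketch of a lambda that becomes serializable once its target type also includes Serializable. The class name SerializableLambdaSketch and its helper methods are hypothetical and exist only for illustration; the sketch uses nothing beyond the standard java.io and java.util.function APIs. It round-trips the lambda inside a single JVM, so it does not reproduce the server-side failure reported above, where the receiving JVM would also need the class that defines the lambda.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

// Hypothetical illustration class, not part of TinkerPop.
public class SerializableLambdaSketch {

    // A lambda with a plain functional-interface target type is not Serializable.
    public static Predicate<String> plainPredicate() {
        return s -> s.isEmpty();
    }

    // The intersection cast adds Serializable to the lambda's target type.
    public static Predicate<String> serializablePredicate() {
        return (Predicate<String> & Serializable) s -> s.isEmpty();
    }

    // Writes an object with standard Java serialization.
    public static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Reads an object back; the defining class must be loadable in this JVM.
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Returns false when serialization is rejected with NotSerializableException.
    public static boolean canSerialize(Object value) throws IOException {
        try {
            toBytes(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    // Serializes and immediately deserializes a value within the same JVM.
    @SuppressWarnings("unchecked")
    public static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
        return (T) fromBytes(toBytes(value));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(canSerialize(plainPredicate()));   // prints false
        Predicate<String> copy = roundTrip(serializablePredicate());
        System.out.println(copy.test(""));                    // prints true
        System.out.println(copy.test("x"));                   // prints false
    }
}
```

The round trip only succeeds here because the class that defines the lambda is already on the classpath of the deserializing JVM, which is the part that shipping bytecode plus a custom ClassLoader is meant to cover in the remote case.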