Github user dalaro commented on the issue:

    https://github.com/apache/incubator-tinkerpop/pull/325

I just pushed some changes that I hacked together this weekend. The key additions are:

* `TinkerPopKryoRegistrator`, which I extracted from my app, and which acts as a `spark.kryo.registrator` impl that knows about TinkerPop types
* `IoRegistryAwareKryoSerializer`, which is a Spark `Serializer` that looks for `GryoPool.CONFIG_IO_REGISTRY` and applies it if present
* `KryoShimLoaderService.applyConfiguration(cfg)`, which replaces direct calls to `HadoopPools.initialize(cfg)` and adds equivalent functionality for initializing the unshaded Kryo serializer pool

The user would theoretically just set

```
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.IoRegistryAwareKryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.TinkerPopKryoRegistrator
# Optional, only needed for custom types
gremlin.io.registry=whatever.user.IoRegistryImpl
```

In practice, when I have a custom `gremlin.io.registry`, I have always had to take the additional step (long before this PR) of forcibly initializing `HadoopPools` before touching `SparkGraphComputer` in my app. Otherwise some part of Spark -- I think the closure serializer -- would attempt to use `HadoopPools` via `ObjectWritable`/`VertexWritable` before initialization and produce garbage on my custom classes.

**This problem predates my PR**. I'm not trying to solve it here, for two reasons. First, I still don't know whether it's a pathology specific to my app or whether TinkerPop is missing a crucial `HadoopPools.initialize` (now, equivalently, `KryoShimLoaderService.applyConfiguration`) call somewhere. Second, `HadoopPools` is such a hideous architectural wart that the ultimate solution probably involves destroying it.
In the past, I've worked around this by defining a custom `spark.serializer` that delegates `newKryo()` to a `GryoSerializer`/`IoRegistryAwareKryoSerializer`, but which has a constructor that invokes `HadoopPools.initialize`/`KryoShimLoaderService.applyConfiguration` (relying on that method's idempotence). Again, this initialization step may just be specific to my app and unnecessary for the average TinkerPop user. It's possible that the config I pasted above will work for others.

FWIW, this passes, so the overrides bug should be fixed along with all this refactoring stuff:

```
mvn clean install -DskipTests=true && mvn verify -pl gremlin-server -DskipIntegrationTests=false -Dtest.single=GremlinResultSetIntegrateTest
```
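
For anyone who wants to try the same workaround (a serializer whose constructor forces pool initialization and then delegates `newKryo()`), here is a minimal sketch of the pattern in plain Java. It deliberately does not touch the real Spark or TinkerPop classes: `SerializerPools`, `KryoFactory`, `RegistryAwareFactory`, and `InitializingFactory` are all hypothetical stand-ins invented for illustration, and the string returned by `newKryo()` is a placeholder for a real Kryo instance. The only thing it is meant to show is why an idempotent initializer makes it safe to call from every constructor.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for HadoopPools / KryoShimLoaderService: a process-wide
// pool that has to be configured before its first use.
final class SerializerPools {
    private static final AtomicReference<Map<String, String>> CONFIG = new AtomicReference<>();
    private static final AtomicInteger INIT_COUNT = new AtomicInteger();

    private SerializerPools() {}

    // Idempotent: only the first call installs a configuration; later calls are no-ops.
    static void applyConfiguration(Map<String, String> cfg) {
        Map<String, String> copy = Collections.unmodifiableMap(new HashMap<>(cfg));
        if (CONFIG.compareAndSet(null, copy)) {
            INIT_COUNT.incrementAndGet();
        }
    }

    // Number of times a configuration was actually installed (0 or 1).
    static int initCount() {
        return INIT_COUNT.get();
    }

    // The configured registry class name, or null if the pool was never initialized.
    static String registry() {
        Map<String, String> cfg = CONFIG.get();
        return cfg == null ? null : cfg.get("gremlin.io.registry");
    }
}

// Hypothetical stand-in for the part of a Spark serializer that builds Kryo instances.
interface KryoFactory {
    String newKryo();
}

// Hypothetical stand-in for the serializer being wrapped; it reads the shared pool.
final class RegistryAwareFactory implements KryoFactory {
    @Override
    public String newKryo() {
        return "kryo[" + SerializerPools.registry() + "]";
    }
}

// The workaround: force initialization in the constructor, then delegate newKryo().
final class InitializingFactory implements KryoFactory {
    private final KryoFactory delegate;

    InitializingFactory(Map<String, String> cfg, KryoFactory delegate) {
        SerializerPools.applyConfiguration(cfg); // safe to repeat because it is idempotent
        this.delegate = delegate;
    }

    @Override
    public String newKryo() {
        return delegate.newKryo();
    }
}
```

Constructing several `InitializingFactory` instances with the same configuration installs it once, and every later `newKryo()` call sees the configured registry. Whether the real initialization call behaves this way under concurrent construction is an assumption of the sketch, not something verified against TinkerPop.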