Hi. As part of my attempt to port PySpark to Python 3, I've re-applied, with modifications, Josh's old commit for using Dill with PySpark (Dill already supports Python 3). Alas, I ran into an odd problem that I could use some help with.

Josh's old commit:
https://github.com/JoshRosen/incubator-spark/commit/2ac8986f3009f0dc133b11d16887fc8ddb33c3d1

My Dill branch:
https://github.com/distobj/spark/tree/dill

(Note: I've been running this in a virtualenv into which I pip-installed dill. I haven't yet figured out the new way to package it in python/lib, as was done for py4j.)

So the problem is that run_tests is failing with this pickle.py error on most of the tests (unsurprisingly, it seems to be those using .cache()):

PicklingError: Can't pickle <type '_sre.SRE_Pattern'>: it's not found as _sre.SRE_Pattern

What's odd is that the same doctests work fine when run from the shell.

TIA for any ideas...
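
P.S. In case it helps anyone poke at this outside of Spark, here is a minimal sketch of the kind of check that might narrow it down. It is only an illustration: it uses the standard library pickle rather than dill, it does not touch PySpark at all, and try_pickle is a hypothetical helper name, not something from either codebase. The error message quotes the pattern type rather than a pattern instance, so the sketch tries both. Whether the type itself pickles depends on the Python version, so I make no claim about that result.

```python
import pickle
import re


def try_pickle(obj):
    """Hypothetical helper: return (True, None) if obj round-trips
    through pickle, else (False, "<ExceptionName>: <message>")."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True, None
    except Exception as exc:  # pickle can raise several exception types
        return False, "%s: %s" % (type(exc).__name__, exc)


pattern = re.compile(r"\d+")

# A compiled pattern *instance*: the re module registers a reducer for
# it, so the standard pickler can serialize it.
instance_ok, instance_err = try_pickle(pattern)

# The pattern *type* itself: types are pickled by reference, i.e. by
# looking the name up in its module. The "it's not found as ..." wording
# in the error comes from that lookup failing.
type_ok, type_err = try_pickle(type(pattern))

print("instance:", instance_ok, instance_err)
print("type:    ", type_ok, type_err)
```

If the instance pickles but the type does not, that would suggest something in the dill code path is trying to serialize the type object by reference rather than the compiled pattern.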