PySpark doesn't attempt to support Jython at present. IMO, while it might be a 
bit faster, it would lose much of the benefit of Python, namely the very 
strong data processing libraries (NumPy, SciPy, Pandas, etc.). So I'm not sure 
it's worth supporting unless someone demonstrates a really major performance 

There was actually a recent patch to add PyPy support, 
which is worth a try if you want 
Python applications to run faster. It might actually be faster overall than 
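
If you want to experiment with a different interpreter, here is a rough sketch, 
not a tested recipe. It assumes your Spark build includes the PyPy patch and 
that an executable named "pypy" is on your PATH (both are assumptions on my 
part). PYSPARK_PYTHON is the environment variable PySpark reads to choose the 
Python executable; the helper names below are made up for illustration.

```python
import os
import platform


def describe_interpreter():
    """Return the implementation name and version of the running interpreter,
    e.g. 'CPython', 'PyPy' or 'Jython' followed by the version string."""
    return "%s %s" % (platform.python_implementation(),
                      platform.python_version())


def env_for_interpreter(executable, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON set to `executable`.

    `executable` is whatever interpreter you want PySpark to launch; "pypy"
    below is only an assumed name, adjust it for your installation.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = executable
    return env


if __name__ == "__main__":
    # Shows which interpreter is actually running this script.
    print(describe_interpreter())
    # The resulting dict could be passed as env= to subprocess when
    # launching bin/pyspark or bin/spark-submit.
    print(env_for_interpreter("pypy", {})["PYSPARK_PYTHON"])
```

Copying the environment rather than mutating os.environ keeps the change 
local to the process you launch, so switching back to CPython is easy.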


On Oct 5, 2014, at 10:16 AM, Robert C Senkbeil wrote:

> Hi there,
> I wanted to ask whether or not anyone has successfully used Jython with the
> pyspark library. I wasn't sure if the C extension support was needed for
> pyspark itself or was just a bonus of using CPython.
> There was a claim that using Jython would be better - if you didn't need
> C extension
> support - because the cost of serialization is lower. However, I have not
> been able to import pyspark into a Jython session. I'm using version 2.7b3
> of Jython and version 1.1.0 of Spark for reference.
> Jython 2.7b3 (default:e81256215fb0, Aug 4 2014, 02:39:51)
> [Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_51
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from pyspark import SparkContext, SparkConf
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "pyspark/", line 63, in <module>
>  File "pyspark/", line 25, in <module>
>  File "pyspark/", line 94, in <module>
>  File "pyspark/", line 341, in <module>
>  File "pyspark/", line 328, in _hijack_namedtuple
> RuntimeError: maximum recursion depth exceeded (Java StackOverflowError)
> Is there something I am missing with this? Did Jython ever work for
> pyspark? The same error happens regardless of whether I use the Python
> files or compile them down to Java class files using Jython first.
> I know that previous documentation (0.9.1) indicated, "PySpark requires
> Python 2.6 or higher. PySpark applications are executed using a standard
> CPython interpreter in order to support Python modules that use C
> extensions. We have not tested PySpark with Python 3 or with alternative
> Python interpreters, such as PyPy or Jython."
> In later versions, it now reflects, "Spark 1.1.0 works with Python 2.6 or
> higher (but not Python 3). It uses the standard CPython interpreter, so C
> libraries like NumPy can be used."
> I'm assuming this means that attempts to use other interpreters failed. If
> so, are there any plans to support something like Jython in the future?
> Signed,
> Chip Senkbeil
