Warning about poor interactions between PySpark and numexpr

Michael Ronquest Tue, 10 Dec 2013 08:00:08 -0800

Hi Everyone,

I've recently run into some unpleasantness with PySpark when trying to use a pandas DataFrame *inside* a mapPartitions function. I've traced the error to numexpr (which pandas uses) and submitted a bug here:

       https://code.google.com/p/numexpr/issues/detail?id=123

Bottom line: platform.machine's os.popen call fails upon close when run under PySpark. Out of curiosity, has anyone run into something similar and found a solution? For now, I've been forced to patch numexpr in order to prevent the call to platform.machine(), as mentioned in the above bug report.
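
For anyone who would rather not edit numexpr itself, here is a rough, untested-under-PySpark sketch of another way to avoid the platform.machine() call: replace it in the worker before pandas is imported. The names machine_without_popen, patch_platform_machine and process_partition are made up for illustration. It assumes a Unix worker (os.uname is not available on Windows) and assumes numexpr looks the function up as platform.machine at call time, which I have not verified.

```python
import os
import platform


def machine_without_popen():
    # os.uname() wraps the uname system call directly, so no child
    # process is spawned. Field 4 is the machine type (e.g. "x86_64").
    return os.uname()[4]


def patch_platform_machine():
    # Hypothetical helper: swap in the popen-free version. This only
    # affects code that calls platform.machine() through the module.
    platform.machine = machine_without_popen


def process_partition(rows):
    # Hypothetical mapPartitions function. The patch is applied first,
    # then pandas (and therefore numexpr) is imported in the worker.
    patch_platform_machine()
    import pandas as pd

    frame = pd.DataFrame(list(rows))
    yield len(frame)
```

It would be used as rdd.mapPartitions(process_partition). If numexpr has already been imported in the worker process before the patch runs, this will not help.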


Thanks,
Mike
