Github user Stibbons commented on the issue:

    https://github.com/apache/spark/pull/14963
  
    I would love to have a bit more feedback on this matter, but it does not
seem to interest the core developers, sadly :(
    It's a bit disappointing: Python support in Spark is great, and being able
to deploy a job as easily as with Java (i.e., developers prepare the job and
describe all of its dependencies independently) would be so useful for Spark.
For the moment, we have to ask our IT guys to install a given Python module on
each executor whenever someone needs a new one. We found a workaround using an
NFS share, but it is not convenient. Automatic virtualenv creation, with pip
installing all the dependencies described in a requirements.txt, plus support
for wheels and Python distribution packages (sdist or bdist), would be so
useful and scalable. Each job could use different libraries, or even the same
library with a different version, just like what is already possible for Java
jars with the --packages argument.
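    As a side note, here is a minimal sketch of one manual alternative people
use today (not what this PR implements; names and paths are illustrative):
pre-install pure-Python dependencies into a zip and ship it with the existing
SparkContext.addPyFile API. It breaks down as soon as a dependency needs
compiled extensions, which is part of why proper wheel/virtualenv support
matters:

        # Illustrative sketch: "deps.zip" is assumed to be built beforehand, e.g.
        #   pip install -r requirements.txt -t deps/
        #   cd deps && zip -r ../deps.zip . && cd ..
        from pyspark import SparkContext

        sc = SparkContext(appName="deps-workaround")
        sc.addPyFile("deps.zip")  # ships the archive to every executor and adds it to the Python path

        # Modules packaged in deps.zip can now be imported inside tasks/UDFs,
        # but only if they are pure Python (no compiled extensions).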
    I am thinking of starting to maintain a fork of Spark, kept in sync with
the latest source code; I would call it "Python Friendly Spark".

