GitHub user jhlch commented on the issue:

    https://github.com/apache/spark/pull/8318
  
    I've got [a branch with a solid first pass at making pyspark pip installable](https://github.com/apache/spark/compare/master...jhlch:pipinstall). A few open questions:
    
    * How does this integrate with the typical build? Once the jar is built, it needs to be copied to a location that setup.py and MANIFEST.in point to (see the sketch after this list).
    * What version requirements are there for numpy and pandas? I'm not confident that the ones I list are correct or as specific as they could be.
    * Set up automated testing:
    
        * run-tests and run-tests.py should use environments where pyspark has been pip installed, and drop the 'find jars' logic they currently rely on (see the second sketch below).
        * testpypi exists and could be useful in CI to make sure packaging and distribution never break. CI Python envs could be initialized using `pip install --extra-index-url https://testpypi.python.org/pypi pyspark`.
    
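    For the run-tests change, the check could be as simple as preferring an importable pyspark over the in-tree jar lookup. Rough sketch; the helper name is hypothetical, not actual run-tests.py code:

    ```python
    # Hypothetical helper for run-tests.py: if pyspark is pip-installed in the
    # active environment, the 'find jars' logic can be skipped entirely.
    def have_pip_installed_pyspark():
        try:
            import pyspark  # noqa: F401
            return True
        except ImportError:
            return False
    ```
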
    I've got too much on my plate to see this to the finish line in the next 
few months, but I do want to see this happen. Is someone else willing to take 
it from here? If not, I'll come back to it in Dec/Jan.

