GitHub user prabinb opened a pull request:
https://github.com/apache/spark/pull/464
[SPARK-1267]Adding a pip installer setup file for PySpark.
The following changes are made in this pull request:
1) A pip installer python/setup.py file is added.
2) A new file, pyspark/pyspark_version.py, is added for maintaining the
PySpark version. It needs to be updated whenever a new version of PySpark is
released.
3) Changed pyspark/__init__.py to validate the SPARK_HOME variable and check
for a PySpark/Spark version mismatch.
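A minimal setup.py along these lines could do the job. This is only a
sketch, not the PR's actual file: the metadata values and the
PYSPARK_VERSION constant below are illustrative assumptions (the real
version would be read from pyspark/pyspark_version.py).

```python
# Hypothetical sketch of python/setup.py -- illustrative only.

# The PR keeps the version in a separate pyspark/pyspark_version.py
# (e.g. a single line such as __version__ = "1.0.0") so that
# pyspark/__init__.py can import it at runtime for the SPARK_HOME check.
PYSPARK_VERSION = "1.0.0"  # assumption; would come from pyspark_version.py

# Package metadata passed to setup(); kept in a dict for easy inspection.
METADATA = dict(
    name="pyspark",
    version=PYSPARK_VERSION,
    description="Apache Spark Python API",
    url="https://github.com/apache/spark",
    packages=["pyspark"],
)

if __name__ == "__main__":
    # Import only when actually building/installing the distribution.
    from setuptools import setup
    setup(**METADATA)
```

With such a file in place, `python setup.py sdist` would build the source
distribution that gets uploaded to PyPI.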
This PySpark build distribution has to be registered and uploaded to PyPI
(the Python Package Index) for every release of PySpark. (Somebody needs to
maintain it.)
python setup.py register
python setup.py sdist upload
More details on registering and uploading a package can be found here:
https://docs.python.org/2/distutils/packageindex.html
Once the package is uploaded, users will be able to install PySpark by
running:
pip install pyspark
The following validations are added to the import (all suggestions to
improve them are welcome):
1) To use this package, the user must set the SPARK_HOME environment
variable to their Spark installation directory, or the import fails.
2) The Spark version in the SPARK_HOME/pom.xml file (<project>...<version>spark
version</version>...</project>) must match the PySpark version
(python/pyspark/pyspark_version.py), or the import fails.
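The two checks above could be sketched roughly as follows. This is an
illustration only: the function names are hypothetical, and the <version>
regex is a simplification (in a real pom.xml the first <version> element
may belong to a parent POM, so the actual PR may locate it differently).

```python
import os
import re


def spark_version_from_pom(pom_text):
    """Extract the first <version> value from pom.xml text.

    Simplified regex-based lookup; illustrative only.
    """
    match = re.search(r"<version>([^<]+)</version>", pom_text)
    return match.group(1).strip() if match else None


def check_spark_home(pyspark_version):
    """Fail the import unless SPARK_HOME is set and versions match."""
    spark_home = os.environ.get("SPARK_HOME")
    if not spark_home:
        raise ImportError(
            "SPARK_HOME is not set; point it at your Spark installation")
    pom_path = os.path.join(spark_home, "pom.xml")
    with open(pom_path) as pom_file:
        spark_version = spark_version_from_pom(pom_file.read())
    if spark_version != pyspark_version:
        raise ImportError(
            "Spark version %s in %s does not match PySpark version %s"
            % (spark_version, pom_path, pyspark_version))
```

pyspark/__init__.py would then call check_spark_home() with the version
from pyspark_version.py before loading the rest of the package.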
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/prabinb/spark python-pip
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/464.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #464
----
commit b0af198edb4c3c57ae97f14670777d770d79bef2
Author: prabinb <prabinb@prinhyltphp0133.(none)>
Date: 2014-04-17T11:14:17Z
[SPARK-1267]Adding a pip installer setup file for PySpark. Introduced a new
file pyspark/pyspark_version.py for maintaining PySpark version. Changed
pyspark/__init__.py to cross validate SPARK_HOME variable and pyspark & spark
version mismatch
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---