[
https://issues.apache.org/jira/browse/TOREE-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Semet updated TOREE-337:
------------------------
Description:
I would like to volunteer to work on adding two "magics" to Toree, related to
Python dependency management:
{code}
%AddPythonDeps
Usage: %AddPythonDeps pypi_package_name [version]
{code}
Download a Python dependency from pypi.python.org and install it on the Spark
cluster using pip. Transitive dependencies will be retrieved automatically.
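
As a rough sketch of the intended behavior (this is not Toree's actual magic API), the arguments of the magic could be translated into a pip invocation as shown below. The helper name `build_pip_command` is hypothetical, and the sketch only builds the command; running it on every node of the cluster is left out.

```python
import sys


def build_pip_command(magic_line):
    """Translate '%AddPythonDeps name [version]' into a pip command.

    Hypothetical helper, for illustration only.
    """
    parts = magic_line.split()
    if not parts or parts[0] != "%AddPythonDeps":
        raise ValueError("not an %AddPythonDeps line: " + repr(magic_line))
    args = parts[1:]
    if len(args) not in (1, 2):
        raise ValueError("Usage: %AddPythonDeps pypi_package_name [version]")
    # Pin the version with '==' when one is given; otherwise let pip choose.
    if len(args) == 2:
        requirement = "{}=={}".format(args[0], args[1])
    else:
        requirement = args[0]
    # 'python -m pip' runs pip for the current interpreter; pip resolves
    # transitive dependencies on its own.
    return [sys.executable, "-m", "pip", "install", requirement]
```

With this sketch, `build_pip_command("%AddPythonDeps numpy 1.1.1")` returns a command list whose last element is `numpy==1.1.1`.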
---
{code}
%AddPythonDist
Usage: %AddPythonDist <url to dist>
{code}
Download the distribution package (source distribution, binary distribution,
or wheel) and install it on the master and workers using pip.
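
A minimal sketch of how the URL could be checked before it is handed to pip, under the assumption that only HTTP(S) URLs pointing to wheels or source archives are accepted. The helper name `build_dist_install_command` and the list of accepted suffixes are illustrative choices, not part of the proposal.

```python
import sys
from urllib.parse import urlparse

# Assumed set of file suffixes: wheels and common source archives.
SUPPORTED_SUFFIXES = (".whl", ".tar.gz", ".zip")


def build_dist_install_command(url):
    """Build a pip command that installs a distribution from a URL.

    Hypothetical helper, for illustration only.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("expected an http(s) URL, got: " + repr(url))
    if not parsed.path.endswith(SUPPORTED_SUFFIXES):
        raise ValueError("unsupported distribution type: " + repr(parsed.path))
    # pip accepts a direct URL to a wheel or source archive as a requirement.
    return [sys.executable, "-m", "pip", "install", url]
```

The same command would then have to be run on the master and on each worker, which this sketch does not cover.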
---
Example:
One would be able to specify something like
{code}
%AddPythonDeps numpy 1.1.1
%AddPythonDeps requests 0.1.1
{code}
I am working on bringing pip and virtualenv support to Spark, since we need to
work in a clean virtualenv. Here is a [proposal on this
subject|http://www.great-a-blog.co/wheel-deployment-for-pyspark/] and the [pull
request|https://github.com/apache/spark/pull/14180].
Let me know if it would make sense to have such a feature within Apache Toree (I
think it's cool to let each job describe its Python dependencies and have
Spark handle them properly and automatically!)
> %AddPythonDeps magic to install packages from Pypi
> --------------------------------------------------
>
> Key: TOREE-337
> URL: https://issues.apache.org/jira/browse/TOREE-337
> Project: TOREE
> Issue Type: Improvement
> Reporter: Semet
> Labels: python,, python-wheel
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)