[ 
https://issues.apache.org/jira/browse/TOREE-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552894#comment-15552894
 ] 

Gino Bustelo commented on TOREE-337:
------------------------------------

Makes total sense. PRs are welcome.

> %AddPythonDeps magic to install packages from Pypi
> --------------------------------------------------
>
>                 Key: TOREE-337
>                 URL: https://issues.apache.org/jira/browse/TOREE-337
>             Project: TOREE
>          Issue Type: Improvement
>            Reporter: Semet
>              Labels: python, python-wheel
>
> I would like to volunteer to work on adding two "magics" to Toree, related to 
> Python dependency management:
> {code}
> %AddPythonDeps
> Usage: %AddPythonDeps pypi_package_name [version]
> {code}
> Downloads a Python dependency from pypi.python.org and installs it on the 
> Spark cluster using pip. Transitive dependencies are retrieved 
> automatically.
> --- 
> {code}
> %AddPythonDist
> Usage: %AddPythonDist <url to dist>
> {code}
> Downloads the distribution package (source distribution, binary 
> distribution, or wheel) and installs it on the master and workers using 
> pip.
> ---
> Example:
> One would be able to specify something like
> {code}
> %AddPythonDeps numpy 1.1.1
> %AddPythonDeps requests 0.1.1
> {code}
> I am working to bring pip and virtualenv support to Spark, since we need to 
> work in a clean virtualenv. Here is a [proposal on this 
> subject|http://www.great-a-blog.co/wheel-deployment-for-pyspark/] and the 
> [pull request|https://github.com/apache/spark/pull/14180].
> Let me know if it would make sense to have such a feature within Apache Toree 
> (I think it's cool to let each job describe its Python dependencies and have 
> Spark automatically handle them properly!)
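
For illustration only, here is a minimal sketch of how a line in the proposed
"%AddPythonDeps pypi_package_name [version]" form could be mapped to a pip
requirement. This is not Toree code and not the reporter's implementation: the
function names parse_add_python_deps and pip_install_command are hypothetical,
and the sketch assumes an optional version is meant as an exact pin
("name==version"). It only builds the pip command; running it and distributing
the packages to the Spark workers is out of scope here.

```python
import sys


def parse_add_python_deps(line):
    """Turn '%AddPythonDeps name [version]' into a pip requirement string.

    Hypothetical helper: assumes the optional version is an exact pin.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "%AddPythonDeps":
        raise ValueError("not an %AddPythonDeps line: " + line)
    args = tokens[1:]
    if len(args) == 1:
        # No version given: let pip pick the latest release.
        return args[0]
    if len(args) == 2:
        # Version given: pin it exactly, e.g. 'numpy==1.1.1'.
        return args[0] + "==" + args[1]
    raise ValueError("usage: %AddPythonDeps pypi_package_name [version]")


def pip_install_command(requirement):
    """Build (but do not run) a 'python -m pip install' command list."""
    return [sys.executable, "-m", "pip", "install", requirement]


if __name__ == "__main__":
    for magic in ("%AddPythonDeps numpy 1.1.1", "%AddPythonDeps requests 0.1.1"):
        print(pip_install_command(parse_add_python_deps(magic)))
```

Building the command as a list (rather than a shell string) would let a caller
pass it to subprocess without shell quoting issues.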



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
