[
https://issues.apache.org/jira/browse/TOREE-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552894#comment-15552894
]
Gino Bustelo commented on TOREE-337:
------------------------------------
Makes total sense. PRs are welcome.
> %AddPythonDeps magic to install packages from Pypi
> --------------------------------------------------
>
> Key: TOREE-337
> URL: https://issues.apache.org/jira/browse/TOREE-337
> Project: TOREE
> Issue Type: Improvement
> Reporter: Semet
> Labels: python, python-wheel
>
> I would like to volunteer to work on adding two "magics" to Toree, related to
> Python dependency management:
> {code}
> %AddPythonDeps
> Usage: %AddPythonDeps pypi_package_name [version]
> {code}
> Downloads a Python dependency from pypi.python.org and installs it on the
> Spark cluster using pip. Transitive dependencies are retrieved
> automatically.
> ---
> {code}
> %AddPythonDist
> Usage: %AddPythonDist <url to dist>
> {code}
> Downloads the distribution package (source distribution, binary
> distribution, or wheel) from the given URL and installs it on the master and
> workers using pip.
> ---
> Example:
> One would be able to specify something like
> {code}
> %AddPythonDeps numpy 1.1.1
> %AddPythonDeps requests 0.1.1
> {code}
> I am working to bring pip and virtualenv support to Spark, since we need to
> work in a clean virtualenv. Here is a [proposal on this
> subject|http://www.great-a-blog.co/wheel-deployment-for-pyspark/] and the
> [pull request|https://github.com/apache/spark/pull/14180].
> Let me know if it would make sense within Apache Toree to have such a feature
> (I think it's cool to let each job describe its Python dependencies and have
> Spark automatically handle them properly!)
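
For illustration, the two proposed magics could map onto pip invocations roughly as follows. This is only a sketch under stated assumptions: the helper names (`pip_args_for_deps`, `pip_args_for_dist`, `parse_magic`) are hypothetical, the code only builds the command line rather than running it, and it does not cover distributing packages to the Spark workers, which the issue leaves to the linked Spark proposal. An actual Toree magic would be written against Toree's own magic API, which is not shown here.

```python
import sys
from typing import List, Optional


def pip_args_for_deps(package: str, version: Optional[str] = None) -> List[str]:
    """Build a pip command for `%AddPythonDeps package [version]` (hypothetical helper)."""
    # pip resolves transitive dependencies on its own, so only the
    # top-level requirement needs to be named.
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_args_for_dist(url: str) -> List[str]:
    """Build a pip command for `%AddPythonDist <url to dist>` (hypothetical helper)."""
    # pip accepts a URL to an sdist or wheel archive directly.
    return [sys.executable, "-m", "pip", "install", url]


def parse_magic(line: str) -> List[str]:
    """Translate one magic line into the pip command it would stand for."""
    parts = line.split()
    if not parts:
        raise ValueError("empty magic line")
    name, args = parts[0], parts[1:]
    if name == "%AddPythonDeps":
        if not 1 <= len(args) <= 2:
            raise ValueError("Usage: %AddPythonDeps pypi_package_name [version]")
        return pip_args_for_deps(*args)
    if name == "%AddPythonDist":
        if len(args) != 1:
            raise ValueError("Usage: %AddPythonDist <url to dist>")
        return pip_args_for_dist(args[0])
    raise ValueError(f"unknown magic: {name}")


if __name__ == "__main__":
    # Print the commands only; nothing is installed.
    for magic in ("%AddPythonDeps numpy 1.1.1", "%AddPythonDeps requests 0.1.1"):
        print(" ".join(parse_magic(magic)))
```

Running the commands inside a clean virtualenv, as the issue suggests, would keep each job's dependencies isolated from the system Python.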
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)