[ https://issues.apache.org/jira/browse/TOREE-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552894#comment-15552894 ]

Gino Bustelo commented on TOREE-337:
------------------------------------

Makes total sense. PRs are welcome.

> %AddPythonDeps magic to install packages from Pypi
> --------------------------------------------------
>
>                 Key: TOREE-337
>                 URL: https://issues.apache.org/jira/browse/TOREE-337
>             Project: TOREE
>          Issue Type: Improvement
>            Reporter: Semet
>              Labels: python,, python-wheel
>
> I would like to volunteer to work on adding two "magics" to Toree, related to
> Python dependency management:
> {code}
> %AddPythonDeps
> Usage: %AddPythonDeps pypi_package_name [version]
> {code}
> Download a Python dependency from pypi.python.org and install it on
> the Spark cluster using pip. Transitive dependencies will be automatically
> retrieved.
> ---
> {code}
> %AddPythonDist
> Usage: %AddPythonDist <url to dist>
> {code}
> Download the distribution package (source distribution, binary
> distribution, or wheel) and install it on the master and workers using
> pip.
> ---
> Example:
> One would be able to specify something like
> {code}
> %AddPythonDeps numpy 1.1.1
> %AddPythonDeps requests 0.1.1
> {code}
> I am working to bring pip and virtualenv support to Spark, since we need to
> work in a clean virtualenv. Here is a [proposal on this
> subject|http://www.great-a-blog.co/wheel-deployment-for-pyspark/] and the
> [pull request|https://github.com/apache/spark/pull/14180].
> Let me know if it would make sense to have such a feature within Apache Toree
> (I think it's cool to let each job describe its Python dependencies and have
> Spark automatically handle them properly!)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)