GitHub user nchammas commented on the pull request:
https://github.com/apache/spark/pull/8318#issuecomment-210265477
@jhlch - I think this will be a tough feature to get in, honestly, but if
you want to take a fresh stab at it then I'm interested in helping with review
and testing.
First, for the record, I think we need someone to lay out the proposed
benefits clearly, since the JIRA for this PR,
[SPARK-1267](https://issues.apache.org/jira/browse/SPARK-1267), doesn't do
that. Committers will want to weigh those benefits against any ongoing
maintenance cost they're being asked to bear.
> I'm intending to package the spark jar as part of the module. This will
mean that there is an order of operations for the deployment to PyPi to be
successful.
This sounds good to me.
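For concreteness, here is a minimal sketch of what bundling the jar inside the Python package could look like. Everything in it — the `pyspark/jars/` layout, the placeholder version string — is an illustrative assumption, not something this PR prescribes:

```python
# Hypothetical setup.py sketch (not from this PR): ship the pre-built Spark
# jar inside the pyspark package so that `pip install` delivers a working
# install. Assumes a build step has already copied the jar(s) into
# pyspark/jars/ before packaging.
from setuptools import setup, find_packages

setup(
    name="pyspark",
    version="0.0.0.dev0",  # placeholder; would have to track the Spark release
    packages=find_packages(),
    package_data={"pyspark": ["jars/*.jar"]},  # bundle the jar(s)
    include_package_data=True,
    zip_safe=False,  # the jars need to exist as real files on disk
)
```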
> Making this change is going to create a new step in the build and
publishing process. Who is responsible for that for Spark?
This, I believe, will be the toughest part of getting a feature like this
in: Committer buy-in.
Packaging Spark for PyPI means committers have extra work to do for every
release, and more things that could go wrong that they have to worry about. Any
Python packaging proposal will have to add little to no committer overhead
(for example, it shouldn't require them to update version strings in more
places), and should probably include some tests to guarantee that the packaging
won't silently break.
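As a strawman for that kind of test, something like the following could run as part of the release checks: build the sdist, install it into a throwaway virtualenv, and make sure `pyspark` actually imports from the installed copy. The paths and the POSIX-only `bin/` layout are assumptions:

```python
# Hypothetical packaging smoke test: build the sdist, install it into a
# fresh virtualenv, and verify that pyspark imports from the installed copy.
import os
import subprocess
import sys
import tempfile

subprocess.check_call([sys.executable, "setup.py", "sdist"])
sdist = os.path.join("dist", sorted(os.listdir("dist"))[-1])  # assumes a clean dist/

with tempfile.TemporaryDirectory() as venv_dir:
    subprocess.check_call([sys.executable, "-m", "venv", venv_dir])
    pip = os.path.join(venv_dir, "bin", "pip")        # POSIX layout assumed
    python = os.path.join(venv_dir, "bin", "python")
    subprocess.check_call([pip, "install", sdist])
    subprocess.check_call([python, "-c", "import pyspark"])
```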
Next, we will have to figure out the details of who will own the PyPI
account and coordinate with the ASF if they need to be in the picture. We will
also likely need to ask the PyPI admins for a special increase to the size
limit on the package we will be allowed to upload, or put some machinery in
place so that the pip installation automatically downloads the large artifacts
from somewhere else.
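The download-at-install-time option could look roughly like this: a custom setuptools install command that fetches the jar from a mirror after the normal install runs. The URL and directory layout are placeholders, and this glosses over checksumming, retries, and offline installs:

```python
# Hypothetical sketch: fetch the large Spark jar from a mirror at install
# time instead of shipping it in the PyPI upload. The URL is a placeholder.
import os
import urllib.request
from setuptools import setup
from setuptools.command.install import install

JAR_URL = "https://example.org/spark/spark-assembly.jar"  # placeholder mirror

class install_with_jar(install):
    def run(self):
        install.run(self)  # do the normal install first
        jar_dir = os.path.join(self.install_lib, "pyspark", "jars")
        os.makedirs(jar_dir, exist_ok=True)
        urllib.request.urlretrieve(
            JAR_URL, os.path.join(jar_dir, "spark-assembly.jar")
        )

setup(
    name="pyspark",
    version="0.0.0.dev0",  # placeholder
    cmdclass={"install": install_with_jar},
)
```

Whether pip and PyPI would bless that pattern is exactly the kind of thing we'd need to sort out with the PyPI admins.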
As for who the relevant committers might be, I think they would be @davies
and @JoshRosen for Python, and @rxin and @srowen for packaging: Hey committers,
are there any circumstances under which Python-specific packaging could become
part of the regular Spark release process? If so, are there any prerequisites
we haven't brought up here that you want to see met? I'm just trying to gauge
whether this has a realistic chance of ever making it in, or whether we just
don't want to do this.
> Is Spark supported/expected to work on Windows?
[Yes](https://github.com/apache/spark/search?q=windows&type=Issues&utf8=%E2%9C%93),
Spark is supported on Windows. (Though now that you mention it, this isn't
spelled out clearly anywhere in the official docs.)