GitHub user nchammas commented on the pull request:
https://github.com/apache/spark/pull/8318#issuecomment-210265477
@jhlch - I think this will be a tough feature to get in, honestly, but if
you want to take a fresh stab at it then I'm interested in helping with review
and testing.
First, for the record, I think we need someone to lay out the proposed
benefits clearly, since the JIRA for this PR,
[SPARK-1267](https://issues.apache.org/jira/browse/SPARK-1267), doesn't do
that. Committers will want to weigh those benefits against any ongoing
maintenance cost they're being asked to bear.
> I'm intending to package the spark jar as part of the module. This will
mean that there is an order of operations for the deployment to PyPi to be
successful.
This sounds good to me.
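For concreteness, here is a minimal sketch of what bundling the jar inside the Python package could look like. Everything in it — the `pyspark/jars/` layout, the placeholder version string — is an illustrative assumption, not something this PR prescribes:

```python
# Hypothetical setup.py sketch (not from this PR): ship the pre-built Spark
# jar inside the pyspark package so that `pip install` delivers a working
# install. Assumes a build step has already copied the jar(s) into
# pyspark/jars/ before packaging.
from setuptools import setup, find_packages

setup(
    name="pyspark",
    version="0.0.0.dev0",  # placeholder; would have to track the Spark release
    packages=find_packages(),
    package_data={"pyspark": ["jars/*.jar"]},  # bundle the jar(s)
    include_package_data=True,
    zip_safe=False,  # the jars need to exist as real files on disk
)
```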
> Making this change is going to create a new step in the build and
publishing process. Who is responsible for that for Spark?
This, I believe, will be the toughest part of getting a feature like this
in: Committer buy-in.
Packaging Spark for PyPI means committers have extra work to do for every
release, and more things that could go wrong that they have to worry about. Any
Python packaging proposal will have to add little to no committer overhead
(for example, it shouldn't require them to update version strings in more
places), and should probably include some tests to guarantee that the packaging
won't silently break.
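As a strawman for that kind of test, something like the following could run as part of the release checks: build the sdist, install it into a throwaway virtualenv, and make sure `pyspark` actually imports from the installed copy. The paths and the POSIX-only `bin/` layout are assumptions:

```python
# Hypothetical packaging smoke test: build the sdist, install it into a
# fresh virtualenv, and verify that pyspark imports from the installed copy.
import os
import subprocess
import sys
import tempfile

subprocess.check_call([sys.executable, "setup.py", "sdist"])
sdist = os.path.join("dist", sorted(os.listdir("dist"))[-1])  # assumes a clean dist/

with tempfile.TemporaryDirectory() as venv_dir:
    subprocess.check_call([sys.executable, "-m", "venv", venv_dir])
    pip = os.path.join(venv_dir, "bin", "pip")        # POSIX layout assumed
    python = os.path.join(venv_dir, "bin", "python")
    subprocess.check_call([pip, "install", sdist])
    subprocess.check_call([python, "-c", "import pyspark"])
```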
Next, we will have to figure out the details of who will own the PyPI
account and coordinate with the ASF if they need to be in the picture. We will
also likely need to ask the PyPI admins for a special increase to the size
limit on the package we will be allowed to upload, or put some machinery in
place so that the pip installation automatically downloads the large artifacts
from somewhere else.
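The download-at-install-time option could look roughly like this: a custom setuptools install command that fetches the jar from a mirror after the normal install runs. The URL and directory layout are placeholders, and this glosses over checksumming, retries, and offline installs:

```python
# Hypothetical sketch: fetch the large Spark jar from a mirror at install
# time instead of shipping it in the PyPI upload. The URL is a placeholder.
import os
import urllib.request
from setuptools import setup
from setuptools.command.install import install

JAR_URL = "https://example.org/spark/spark-assembly.jar"  # placeholder mirror

class install_with_jar(install):
    def run(self):
        install.run(self)  # do the normal install first
        jar_dir = os.path.join(self.install_lib, "pyspark", "jars")
        os.makedirs(jar_dir, exist_ok=True)
        urllib.request.urlretrieve(
            JAR_URL, os.path.join(jar_dir, "spark-assembly.jar")
        )

setup(
    name="pyspark",
    version="0.0.0.dev0",  # placeholder
    cmdclass={"install": install_with_jar},
)
```

Whether pip and PyPI would bless that pattern is exactly the kind of thing we'd need to sort out with the PyPI admins.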
As for who the relevant committers might be, I think they would be @davies
and @JoshRosen for Python, and @rxin and @srowen for packaging: Hey committers,
are there any circumstances under which Python-specific packaging could become
part of the regular Spark release process? If so, are there any prerequisites
we haven't brought up here that you want to see met? I'm just trying to gauge
whether this has a realistic chance of ever making it in, or whether we just
don't want to do this.
> Is Spark supported/expected to work on Windows?
[Yes](https://github.com/apache/spark/search?q=windows&type=Issues&utf8=%E2%9C%93),
Spark is supported on Windows. (Though now that you mention it, this isn't
spelled out clearly anywhere in the official docs.)