[
https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660553#comment-14660553
]
Ryan Williams commented on SPARK-1517:
--------------------------------------
h3. Maven snapshots
I hear your point that idiomatic Maven-snapshot workflows are not well suited
to this task. Something I've been doing instead is running commands like this
from within a Spark repo:
{code}
$ sha=$(git --no-pager log --no-walk --format="%h" HEAD)
$ mvn versions:set -DgenerateBackupPoms=false -DnewVersion=$sha
$ mvn install -DskipTests
{code}
This renames the version in all POMs to the abbreviated SHA of {{HEAD}}, builds
Spark, and installs the SHA-namespaced artifacts in my local Maven cache, at
e.g. {{~/.m2/repository/org/apache/spark/spark-core_2.10/901dbd0}}.
Then I just put {{901dbd0}} as the version in some other project and, voila, I
can link against arbitrary Spark SHAs, have many co-exist in my local Maven
cache without them all being named {{1.x.y-SNAPSHOT}}, etc. [Here's an
example|https://github.com/hammerlab/pageant/blob/56bff88f426dd69083424a91cc35099a2a157f10/pom.xml#L30]
where I needed a patched Spark before {{1.4.1}} was released with the fix I
needed.
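To check which SHA-versioned builds are already in the local cache, something like the following could work. It is only a sketch: {{spark_artifact_dir}} and the {{M2_REPO}} override are hypothetical names introduced here, and the default Maven local repository location ({{~/.m2/repository}}) is assumed.

```shell
# Hypothetical helper (not part of Spark or Maven): print the local-repository
# directory that holds one SHA-versioned Spark module.
# M2_REPO is an assumed override; it defaults to the standard ~/.m2/repository.
spark_artifact_dir() {
  sha=$1
  module=${2:-spark-core_2.10}
  printf '%s\n' "${M2_REPO:-$HOME/.m2/repository}/org/apache/spark/$module/$sha"
}

# Example: report whether the build for a given SHA is installed locally.
dir=$(spark_artifact_dir 901dbd0)
if [ -d "$dir" ]; then
  echo "installed: $dir"
else
  echo "missing: $dir"
fi
```

The second argument selects another module, e.g. {{spark_artifact_dir 901dbd0 spark-sql_2.10}}.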
Could any existing continuous build infrastructure be modified to run the {{mvn
versions:set}} command above and publish artifacts to some Maven repository,
ID'd by SHA?
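For concreteness, here is a rough sketch of what such a CI step might run. It only prints the commands rather than executing them. {{publish_plan}}, the repository id, and the URL are placeholders I made up, and the exact value format of the deploy plugin's {{altDeploymentRepository}} property depends on the plugin version, so treat that part as an assumption.

```shell
# Hypothetical sketch: print the commands a CI job could run to publish
# SHA-versioned artifacts. Nothing is executed here; the repository id and
# URL are placeholders, not real infrastructure.
publish_plan() {
  sha=$1
  repo_id=$2
  repo_url=$3
  # Same version rewrite as in the local workflow above.
  echo "mvn versions:set -DgenerateBackupPoms=false -DnewVersion=${sha}"
  # Deploy instead of install, pointing at an alternate repository.
  echo "mvn deploy -DskipTests -DaltDeploymentRepository=${repo_id}::default::${repo_url}"
}

publish_plan 901dbd0 sha-builds https://repo.example.org/spark-sha
```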
h3. Binaries
It also makes sense that your ASF user account will not scale for this purpose
:) OTOH, it should be possible to store these cheaply somewhere.
{{spark-1.4.1-bin-hadoop2.4.tgz}} is ~234MB and there are ~4000 SHAs from 1.2.0
to 1.5.0, so hosting every single SHA in that range would be roughly 1 TB per
binary variant (a few TB if several variants are kept), afaict.
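A back-of-the-envelope check of those figures, assuming one tarball of about this size per SHA and decimal units:

```shell
# Rough storage estimate from the approximate figures quoted above.
size_mb=234      # approximate size of one binary tarball, in MB
sha_count=4000   # approximate number of SHAs from 1.2.0 to 1.5.0
total_mb=$((size_mb * sha_count))
echo "total: ${total_mb} MB (~$((total_mb / 1000)) GB)"
```

That is for a single binary variant; keeping several variants per SHA would multiply the total accordingly.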
Analogous to my previous question: could any existing continuous build
infrastructure be modified to run the {{mvn versions:set}} command above and
upload binaries somewhere that could hold more than just the last few?
As I understand it, these binaries are already being generated but mostly
deleted within ~24hrs because your ASF userdir runs out of space?
> Publish nightly snapshots of documentation, maven artifacts, and binary builds
> ------------------------------------------------------------------------------
>
> Key: SPARK-1517
> URL: https://issues.apache.org/jira/browse/SPARK-1517
> Project: Spark
> Issue Type: Improvement
> Components: Build, Project Infra
> Reporter: Patrick Wendell
> Assignee: Patrick Wendell
> Priority: Critical
>
> Should be pretty easy to do with Jenkins. The only thing I can think of that
> would be tricky is to set up credentials so that jenkins can publish this
> stuff somewhere on apache infra.
> Ideally we don't want to have to put a private key on every jenkins box
> (since they are otherwise pretty stateless). One idea is to encrypt these
> credentials with a passphrase and post them somewhere publicly visible. Then
> the jenkins build can download the credentials provided we set a passphrase
> in an environment variable in jenkins. There may be simpler solutions as well.