Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
+1 I tested the source and Hadoop 2.4 release. Checksums and signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't fail any more than usual. FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another project and have encountered no problems.

I notice that the 1.1.0 release removes the CDH4-specific build, but adds two MapR-specific builds. Compare with https://dist.apache.org/repos/dist/release/spark/spark-1.0.2/ I commented on the commit: https://github.com/apache/spark/commit/ceb19830b88486faa87ff41e18d03ede713a73cc

I'm in favor of removing all vendor-specific builds. This change *looks* a bit funny as there was no JIRA (?) and appears to swap one vendor for another. Of course there's nothing untoward going on, but what was the reasoning? It's best avoided, and MapR already distributes Spark just fine, no?

This is a gray area with ASF projects. I mention it as well because it came up with Apache Flink recently (http://mail-archives.eu.apache.org/mod_mbox/incubator-flink-dev/201408.mbox/%3CCANC1h_u%3DN0YKFu3pDaEVYz5ZcQtjQnXEjQA2ReKmoS%2Bye7%3Do%3DA%40mail.gmail.com%3E). Another vendor rightly noted this could look like favoritism. They changed to remove vendor releases.

On Fri, Aug 29, 2014 at 3:14 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.1.0!

The tag to be voted on is v1.1.0-rc2 (commit 711aebb3):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=711aebb329ca28046396af1e34395a0df92b5327

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc2/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1029/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc2-docs/

Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Monday, September 01, at 03:11 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== Regressions fixed since RC1 ==
LZ4 compression issue: https://issues.apache.org/jira/browse/SPARK-3277

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release.

== What default changes should I be aware of? ==
1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf
2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false.
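(The signature and checksum checks Sean mentions amount to roughly the following commands; the artifact names under the RC2 directory are illustrative:

  gpg --import pwendell.asc                          # the signing key linked above
  gpg --verify spark-1.1.0.tgz.asc spark-1.1.0.tgz   # verify the signature
  shasum spark-1.1.0.tgz                             # compare against the posted digest file

And a minimal sketch of restoring the pre-1.1 defaults called out at the end of the vote email, using the property names given there:

  val conf = new org.apache.spark.SparkConf()
    .set("spark.io.compression.codec", "lzf") // restore the old default codec
    .set("spark.shuffle.spill", "false")      // disable PySpark external spilling
)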
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Hey Sean,

The reason there are no longer CDH-specific builds is that all newer versions of CDH and HDP work with builds for the upstream Hadoop projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and the Hadoop-without-Hive (also 2.4) build.

For MapR - we can't officially post those artifacts on ASF web space when we make the final release; we can only link to them as being hosted by MapR specifically, since they use non-compatible licenses. However, I felt that providing these during a testing period was alright, with the goal of increasing test coverage. I couldn't find any policy against posting these on personal web space during RC voting. However, we can remove them if there is one.

Dropping CDH4 was more because it is now pretty old, but we can add it back if people want. The binary packaging is a slightly separate question from release votes, so I can always add more binary packages whenever. And on this, my main concern is covering the most popular Hadoop versions to lower the bar for users to build and test Spark.

- Patrick
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
(Copying my reply since I don't know if it goes to the mailing list)

Great, thanks for explaining the reasoning. You're saying these aren't going into the final release? I think that moots any issue surrounding distributing them then.

This is all I know of from the ASF: https://community.apache.org/projectIndependence.html I don't read it as expressly forbidding this kind of thing, although you can see how it bumps up against the spirit. There's not a bright line -- what about Tomcat providing binaries compiled for Windows, for example? Does that favor an OS vendor?

From this technical ASF perspective only the releases matter -- do what you want with snapshots and RCs. The only issue there is maybe releasing something different than was in the RC; is that at all confusing? Just needs a note. I think this theoretical issue doesn't exist if these binaries aren't released, so I see no reason not to proceed.

The rest is a different question about whether you want to spend time maintaining this profile and candidate. The vendor already manages their own build, I think, and -- I don't know -- may even prefer not to have a different special build floating around. There's also the theoretical argument that this turns off other vendors from adopting Spark if it's perceived to be too connected to other vendors. I'd like to maximize Spark's distribution, and there's some argument you do this by not making vendor profiles. But as I say, a different question to just think about over time...

(oh and PS for my part I think it's a good thing that CDH4 binaries were removed. I wasn't arguing for resurrecting them)
Re: [Spark SQL] off-heap columnar store
> > The reason I'm asking about the columnar compressed format is that there are some problems for which Parquet is not practical.
>
> Can you elaborate?

Sure.

- The organization or company has no Hadoop, but a significant investment in some other NoSQL store.
- Need to efficiently add a new column to existing data.
- Need to mark some existing rows as deleted or replace small bits of existing data.

For these use cases, it would be much more efficient and practical if we didn't have to take the original data from the datastore and convert it to Parquet first. Doing so costs significant latency and causes Ops headaches in having to maintain HDFS. It would be great to be able to load data directly into the columnar format, into the InMemoryColumnarCache.
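(For context, the existing route into the in-memory columnar cache goes through a registered table. A minimal sketch against the 1.1-era API, where `records` is a hypothetical SchemaRDD built from any source:

  import org.apache.spark.sql.SQLContext

  val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext
  // `records` is a hypothetical SchemaRDD obtained from any datastore
  records.registerTempTable("records")
  sqlContext.cacheTable("records")  // materializes the table into the in-memory columnar cache
  sqlContext.sql("SELECT COUNT(*) FROM records").collect()

This is exactly the indirection being complained about: the data must first become a SchemaRDD/table before the columnar cache will hold it.)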
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Personally I'd actually consider putting CDH4 back if there are still users on it. It's always better to be inclusive, and the convenience of a one-click download is high. Do we have a sense of what % of CDH users still use CDH4?

Matei
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Yeah, we can't/won't post MapR binaries on the ASF web space for the release. However, I have been linking to them (at their request) with a clear identifier that it is an incompatible license and a 3rd-party build.

The only vendor-specific build property we provide is compatibility with different Hadoop FileSystem clients, since unfortunately there is not a universally adopted client/server protocol. I think our goal has always been to provide a path for using ASF Spark with vendor-specific filesystems. Some vendors perform backports or enhancements... and this of course we would never want to manage in the upstream project.

In terms of vendor support for this approach - in the early days Cloudera asked us to add a CDH4 repository, and more recently Pivotal and MapR also asked us to allow linking against their hadoop-client libraries. So we've added these based on direct requests from vendors. Given the ubiquity of the Hadoop FileSystem API, it's hard for me to imagine ruffling feathers by supporting this. But if we get feedback in that direction over time we can of course consider a different approach.

- Patrick
RE: Working Formula for Hive 0.13?
I have a preliminary patch against Spark 1.0.2, which is attached to SPARK-2706. Now I am working on supporting both Hive 0.12 and Hive 0.13.1 in a non-intrusive way (not breaking any existing Hive 0.12 support when introducing support for the new version). I will attach a proposal for solving the multi-version support issue to SPARK-2706 soon.

Thanks.

Zhan Zhang
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
On Fri, Aug 29, 2014 at 7:42 AM, Patrick Wendell pwend...@gmail.com wrote:
> In terms of vendor support for this approach - in the early days Cloudera asked us to add a CDH4 repository, and more recently Pivotal and MapR also asked us to allow linking against their hadoop-client libraries. So we've added these based on direct requests from vendors. Given the ubiquity of the Hadoop FileSystem API, it's hard for me to imagine ruffling feathers by supporting this. But if we get feedback in that direction over time we can of course consider a different approach.

By this, you mean that it's easy to control the Hadoop version in the build and set it to some other vendor-specific release? Yes, that seems ideal. Making the build flexible, and adding the repository references to pom.xml, is part of enabling that -- to me, no question that's good. So you can always roll your own build for your cluster, if you need to.

I understand the role of the cdh4 / mapr3 / mapr4 binaries as just a convenience. But it's a convenience for people who...

- are installing Spark on a cluster (i.e. not an end user)
- that doesn't have it in their distro already
- whose distro isn't compatible with a plain vanilla Hadoop distro

That can't be many. CDH4.6+ is most of the installed CDH base, and it already has Spark. I thought MapR already had Spark built in. The audience seems small enough, and the convenience relatively small enough (is it hard to run the distribution script?), that it caused me to ask whether it was worth bothering to provide these, especially given the possible ASF sensitivity.

I say crack on; you get my point.
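(For concreteness, rolling your own build against a vendor's Hadoop client is a single Maven invocation; the version string here is illustrative of a CDH4 MR1 client:

  mvn -Dhadoop.version=2.0.0-mr1-cdh4.7.0 -DskipTests clean package
)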
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
FWIW we use CDH4 extensively and would very much appreciate having a prebuilt version of Spark for it. We're doing a CDH 4.4 to 4.7 upgrade across all the clusters now and have plans for a 5.x transition after that.
Running Spark On Yarn without Spark-Submit
Hi,

My requirement is to run Spark on YARN without using the spark-submit script. I have a servlet and a Tomcat server. As and when a request comes in, it creates a new SparkContext and keeps it alive for further requests. I am setting my master in SparkConf as sparkConf.setMaster("yarn-cluster"), but the request is stuck indefinitely. This works when I set sparkConf.setMaster("yarn-client"). I am not sure why it is not launching the job in yarn-cluster mode. Any thoughts?

Thanks and Regards,
Archit Thakur.
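(A minimal sketch of the setup described above, with hypothetical names. As explained in the reply below, yarn-client works here because the driver lives in the servlet JVM; yarn-cluster expects the driver itself to run inside the cluster:

  import org.apache.spark.{SparkConf, SparkContext}

  // Created once by the servlet container and reused across requests.
  val conf = new SparkConf()
    .setMaster("yarn-client")     // driver runs in the Tomcat JVM
    .setAppName("servlet-spark")  // hypothetical app name
  val sc = new SparkContext(conf)
)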
Re: Running Spark On Yarn without Spark-Submit
including u...@spark.apache.org.
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
I suspect there are more CDH4 than CDH5 clusters. Most people plan to move to CDH5 within, say, 6 months.
Re: Running Spark On Yarn without Spark-Submit
Archit,

We are using yarn-cluster mode and calling Spark via the Client class directly from the servlet server. It works fine.

As for establishing a communication channel for further requests: it should be possible in yarn-client mode, but not in yarn-cluster mode. In yarn-client mode, the Spark driver is outside the YARN cluster, so it can issue more commands. In yarn-cluster mode, all programs, including the Spark driver, run inside the YARN cluster, and there is no communication channel with the client until the job finishes. If your job is to keep the SparkContext alive and wait for other commands, then it will wait forever.

I am actually working on some improvements to this and experimenting with them in our product. I will create PRs when I feel comfortable with the solution:

1) Change the Client API to allow the caller to know the YARN app resource capacity before passing arguments.
2) Add a YarnApplicationListener to the Client.
3) Provide a communication channel between the application and the Spark YARN client in the cluster.

#1 is not directly related to the communication discussed here. #2 allows the application to get application life-cycle callbacks (app start, end, in progress, failure, etc.) along with YARN resource allocations. I changed #1 and #2 in a forked Spark, and it has worked well on CDH5; I am testing against 2.0.5-alpha as well.

For #3 I did not change Spark itself, as I am not sure of the best approach yet. I put the change in the application runner, which launches the Spark YARN client in the cluster. The runner in the YARN cluster gets the application's host and port information from the passed configuration (args), then creates an Akka actor using the SparkContext's actor system and sends a handshake message to the caller outside the cluster; after that you have two-way communication. With this approach, I can send Spark listener callbacks to the app, error messages, app-level messages, etc. The runner inside the cluster can also receive requests from outside the cluster, such as stop.

We are not sure the Akka approach is the best, so I am still experimenting with it. So far it does what we want. Hope this helps.

Chester

Sent from my iPhone
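(A rough sketch of the handshake pattern Chester describes, with hypothetical actor and path names, assuming Akka 2.2-style APIs and access to the context's actor system via SparkEnv; this is an illustration of the idea, not his actual code:

  import akka.actor._
  import org.apache.spark.SparkEnv

  // Hypothetical cluster-side actor: announces itself to the caller outside
  // the cluster, then listens for commands such as "stop".
  class RunnerActor(callerPath: String) extends Actor {
    override def preStart(): Unit = {
      // handshake: tell the caller outside the cluster how to reach us
      context.actorSelection(callerPath) ! ("handshake", self.path.toString)
    }
    def receive: Receive = {
      case "stop" => context.system.shutdown() // e.g. tear down the SparkContext and exit
      case other  => () // relay Spark listener callbacks, app-level messages, etc. (elided)
    }
  }

  // Inside the runner launched in the cluster, reusing the SparkContext's actor
  // system; the caller's address comes from the passed configuration (args):
  val callerPath = "akka.tcp://client@callerHost:port/user/app-listener" // hypothetical
  SparkEnv.get.actorSystem.actorOf(Props(classOf[RunnerActor], callerPath), "app-runner")
)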
Re: emergency jenkins restart, aug 29th, 730am-9am PDT -- plus a postmortem
reminder: this is happening right now. jenkins is currently in quiet mode, and in ~30 minutes, will be briefly going down.

On Thu, Aug 28, 2014 at 1:03 PM, shane knapp skn...@berkeley.edu wrote:

as with all software upgrades, sometimes things don't always work as expected. a recent change to stapler[1], to verbosely report NotExportableExceptions[2], is spamming our jenkins log file with stack traces, which is growing rather quickly (1.2G since 9am). this has been reported to the jenkins jira[3], and a fix has been pushed and will be rolled out soon[4].

this isn't affecting any builds, and jenkins is happily humming along. in the interim, so that we don't run out of disk space, i will be redirecting the jenkins logs tomorrow morning to /dev/null for the long weekend. once a real fix has been released, i will update any packages needed and redirect the logging back to the log file. other than a short downtime, this will have no user-facing impact.

please let me know if you have any questions/concerns. thanks for your patience!

shane the new guy :)

[1] -- https://wiki.jenkins-ci.org/display/JENKINS/Architecture
[2] -- https://github.com/stapler/stapler/commit/ed2cb8b04c1514377f3a8bfbd567f050a67c6e1c
[3] -- https://issues.jenkins-ci.org/browse/JENKINS-24458?focusedCommentId=209247
[4] -- https://github.com/stapler/stapler/commit/e2b39098ca1f61a58970b8a41a3ae79053cf30e3
Re: Jira tickets for starter tasks
Cheng Lian-2 wrote:
> You can just start the work :)

Given 100+ contributors, starting work without a JIRA issue assigned to you could lead to duplication of effort by well-meaning people who have no idea they are working on the same issue. This does happen, and I don't think it's a good thing.

Just my $0.02

--
Madhu
https://www.linkedin.com/in/msiddalingaiah
Re: emergency jenkins restart, aug 29th, 730am-9am PDT -- plus a postmortem
this is done.
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
We just used CDH 4.7 for our production cluster. And I believe we won't use CDH 5 in the next year.

Sent from my iPhone
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Okay I'll plan to add cdh4 binary as well for the final release!

---
sent from my phone
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
There were several formatting and typographical errors in the SQL docs that I've fixed in this PR: https://github.com/apache/spark/pull/2201. Dunno if we want to roll that into the release.
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Hey Nicholas,

Thanks for this, we can merge in doc changes outside of the actual release timeline, so we'll make sure to loop those changes in before we publish the final 1.1 docs.

- Patrick
Re: Jira tickets for starter tasks
Hi Josh, Can you add me as well? Thanks, Ron

On Aug 28, 2014, at 3:56 PM, Josh Rosen rosenvi...@gmail.com wrote:
A JIRA admin needs to add you to the "Contributors" role group in order to allow you to assign issues to yourself. I've added this email address to that group, so you should be set! - Josh

On August 28, 2014 at 3:52:57 PM, Bill Bejeck (bbej...@gmail.com) wrote:
Hi, How do I get a starter task jira ticket assigned to myself? Or do I just do the work and issue a pull request with the associated jira number? Thanks, Bill
Re: Jira tickets for starter tasks
Added you; you should be set! If anyone else wants me to add them, please email me off-list so that we don’t end up flooding the dev list with replies. Thanks!

On August 29, 2014 at 10:03:41 AM, Ron's Yahoo! (zlgonza...@yahoo.com) wrote:
Hi Josh, Can you add me as well? Thanks, Ron
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
[Let me know if I should be posting these comments in a different thread.] Should the default Spark version in spark-ec2 https://github.com/apache/spark/blob/e1535ad3c6f7400f2b7915ea91da9c60510557ba/ec2/spark_ec2.py#L86 be updated for this release? Nick
new jenkins plugin installed and ready for use
i have always found the 'Rebuild' plugin super useful: https://wiki.jenkins-ci.org/display/JENKINS/Rebuild+Plugin this is now installed and enabled. enjoy! shane
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Oh darn - I missed this update. GRR, unfortunately I think this means I'll need to cut a new RC. Thanks for catching this Nick.

On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote:
Should the default Spark version in spark-ec2 be updated for this release? Nick
Re: Compile error with XML elements
Hi, Devl! I got the same problem. You can try upgrading your Scala plugin to 0.41.2; it works on my Mac.

On Aug 12, 2014, at 15:19, Devl Devel devl.developm...@gmail.com wrote:
When compiling the master checkout of Spark, the IntelliJ compile fails with:

    Error:(45, 8) not found: value $scope
        <div class="row-fluid">
         ^

which is caused by HTML elements in classes like HistoryPage.scala:

    val content = <div class="row-fluid"> <div class="span12"> ...

How can I compile these classes that have HTML node elements in them? Thanks in advance.
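For readers hitting this for the first time, here is a minimal sketch (a hypothetical class, not the real HistoryPage.scala) of the kind of XML literal involved. Scala XML literals desugar into scala.xml.Elem construction that references a synthetic $scope value, and that reference is what older IntelliJ Scala plugin builds failed to resolve even though scalac accepts the code.

    import scala.xml.Elem

    object XmlLiteralRepro {
      // The XML literal below compiles fine with scalac, but its desugaring
      // references $scope (scala.xml.TopScope), which buggy IntelliJ plugin
      // versions reported as "not found: value $scope".
      val content: Elem =
        <div class="row-fluid">
          <div class="span12">History Server</div>
        </div>

      def main(args: Array[String]): Unit = {
        println(content)
      }
    }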
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
In our internal projects we use this bit of code in the maven pom to create a properties file with build information (sorry for the messy indentation). Then we have code that reads this property file somewhere and provides that info. This should make it easier to not have to change version numbers in Scala/Java/Python code ever again. :-) Shouldn't be hard to do something like that in sbt (actually should be much easier).

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-antrun-plugin</artifactId>
      <version>1.6</version>
      <executions>
        <execution>
          <id>build-info</id>
          <phase>compile</phase>
          <goals>
            <goal>run</goal>
          </goals>
          <configuration>
            <target>
              <taskdef resource="net/sf/antcontrib/antcontrib.properties"
                       classpathref="maven.plugin.classpath" />
              <if>
                <not>
                  <isset property="build.hash" />
                </not>
                <then>
                  <exec executable="git" outputproperty="build.hash">
                    <arg line="rev-parse HEAD" />
                  </exec>
                </then>
              </if>
              <echo>buildRevision: ${build.hash}</echo>
              <echo file="${build.info}" message="version=${project.version}${line.separator}" />
              <echo file="${build.info}" append="true" message="hash=${build.hash}${line.separator}" />
              <echo file="${build.info}" append="true" />
            </target>
          </configuration>
        </execution>
      </executions>
      <dependencies>
        <dependency>
          <groupId>ant-contrib</groupId>
          <artifactId>ant-contrib</artifactId>
          <version>1.0b3</version>
          <exclusions>
            <exclusion>
              <groupId>ant</groupId>
              <artifactId>ant</artifactId>
            </exclusion>
          </exclusions>
        </dependency>
      </dependencies>
    </plugin>

On Fri, Aug 29, 2014 at 11:43 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote:
Sounds good. As an FYI, we had this problem with the 1.0.2 release: https://issues.apache.org/jira/browse/SPARK-3242. Is there perhaps some kind of automated check we can make to catch this for us in the future? Where would it go?
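One possible shape for the automated check asked about above, sketched in Scala: a small release-time guard that parses the default version out of ec2/spark_ec2.py and compares it against the version being released. The file layout, the regex, and the object name here are all assumptions for illustration, not anything that exists in the build today.

    import scala.io.Source

    // Hypothetical release-time guard: fail if the default Spark version
    // hard-coded in ec2/spark_ec2.py disagrees with the release version.
    object Ec2VersionCheck {
      def main(args: Array[String]): Unit = {
        val releaseVersion = args.headOption.getOrElse("1.1.0")
        val source = Source.fromFile("ec2/spark_ec2.py")
        val script = try source.mkString finally source.close()
        // Assumes the default appears inline on the --spark-version option,
        // on a single line; the real spark_ec2.py may be laid out differently.
        val versionPattern = """--spark-version.*default="([^"]+)"""".r
        versionPattern.findFirstMatchIn(script) match {
          case Some(m) if m.group(1) == releaseVersion =>
            println(s"spark-ec2 default matches release version $releaseVersion")
          case Some(m) =>
            sys.error(s"spark-ec2 default is ${m.group(1)}, expected $releaseVersion")
          case None =>
            sys.error("could not find a --spark-version default in spark_ec2.py")
        }
      }
    }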
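And on the reading side of the build-info idea, a minimal Scala sketch of the "code that reads this property file": the object name and the "/build.info" classpath location are assumptions that mirror the ${build.info} property in the Ant snippet above.

    import java.util.Properties

    // Loads the properties file generated by the maven-antrun snippet above
    // and exposes its values; falls back to "unknown" if the file is absent.
    object BuildInfo {
      private val props: Properties = {
        val p = new Properties()
        val in = getClass.getResourceAsStream("/build.info")
        if (in != null) {
          try p.load(in) finally in.close()
        }
        p
      }

      def version: String = props.getProperty("version", "unknown")
      def buildHash: String = props.getProperty("hash", "unknown")
    }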
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
Just noticed one thing: although --with-hive is deprecated in favor of -Phive, make-distribution.sh still relies on $SPARK_HIVE (which was controlled by --with-hive) to determine whether to include the datanucleus jar files. This means we have to do something like

    SPARK_HIVE=true ./make-distribution.sh ...

to enable Hive support; otherwise the datanucleus jars are not included in lib/. This issue is similar to SPARK-3234 https://issues.apache.org/jira/browse/SPARK-3234: both SPARK_HADOOP_VERSION and SPARK_HIVE are controlled by deprecated command-line options.
Re: [VOTE] Release Apache Spark 1.1.0 (RC2)
+1. Validated several custom analysis pipelines on a private cluster in standalone mode. Tested new PySpark support for arbitrary Hadoop input formats, works great! -- Jeremy
Need to check approach for continuing development on Spark
Hi, We are developing an app in Spring that uses Cassandra, calling the DataStax APIs from Java to query it. An internal library is responsible for calling Cassandra and other data sources like RDS. From Spark, we call several client APIs provided by the client jar to perform operations on that data:
1. Reading data from S3 and inserting it into Cassandra by passing objects through the API, which then stores them in Cassandra internally.
2. Retrieving data from Cassandra through the API as objects, then processing that data to generate metrics and saving them back to Cassandra, again only through the APIs.
3. Internally, through those same APIs, calculating aggregates and separating the data into bands, etc.
The whole project is driven by Spring. Please let me know whether this approach sounds reasonable.
Re: Compile error with XML elements
In some cases IntelliJ's Scala compiler can't compile valid Scala source files. Hopefully they fix (or have fixed) this in a newer version. - Patrick