Re: [RESULT] [VOTE] Release Apache Spark 1.1.0 (RC4)
Hey just a heads up to everyone - running a bit behind on getting the final artifacts and notes up. Finalizing this release was much more complicated than previous ones due to new binary formats (we need to redesign the download page a bit for this to work) and the large increase in contributor count. Next time we can pipeline this work to avoid a delay. I did cut the v1.1.0 tag today. We should be able to do the full announce tomorrow. Thanks, Patrick On Sun, Sep 7, 2014 at 5:50 PM, Patrick Wendell pwend...@gmail.com wrote: This vote passes with 8 binding +1 votes and no -1 votes. I'll post the final release in the next 48 hours... just finishing the release notes and packaging (which now takes a long time given the number of contributors!). +1: Reynold Xin* Michael Armbrust* Xiangrui Meng* Andrew Or* Sean Owen Matthew Farrellee Marcelo Vanzin Josh Rosen* Cheng Lian Mubarak Seyed Matei Zaharia* Nan Zhu Jeremy Freeman Denny Lee Tom Graves* Henry Saputra Egor Pahomov Rohit Sinha Kan Zhang Tathagata Das* Reza Zadeh -1: 0: * = binding - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Tested streaming integration with flume on a local test bed. On Thu, Sep 4, 2014 at 6:08 PM, Kan Zhang kzh...@apache.org wrote: +1 Compiled, ran newly-introduced PySpark Hadoop input/output examples. On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov pahomov.e...@gmail.com wrote: +1 Compiled, ran on yarn-hadoop-2.3 simple job. 2014-09-04 22:22 GMT+04:00 Henry Saputra henry.sapu...@gmail.com: LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- *Sincerely yoursEgor PakhomovScala Developer, Yandex*
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Tested recently merged mllib matrix multiplication bugfix https://github.com/apache/spark/pull/2224 On Sat, Sep 6, 2014 at 2:35 PM, Tathagata Das tathagata.das1...@gmail.com wrote: +1 Tested streaming integration with flume on a local test bed. On Thu, Sep 4, 2014 at 6:08 PM, Kan Zhang kzh...@apache.org wrote: +1 Compiled, ran newly-introduced PySpark Hadoop input/output examples. On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov pahomov.e...@gmail.com wrote: +1 Compiled, ran on yarn-hadoop-2.3 simple job. 2014-09-04 22:22 GMT+04:00 Henry Saputra henry.sapu...@gmail.com: LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- *Sincerely yoursEgor PakhomovScala Developer, Yandex*
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1. Ran spark on yarn on hadoop 0.23 and 2.x. Tom On Wednesday, September 3, 2014 2:25 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
On 09/03/2014 04:23 PM, Nicholas Chammas wrote: On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell pwend...@gmail.com wrote: == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. Will these changes be called out in the release notes or somewhere in the docs? That last one (which I believe is what we discovered as the result of SPARK- https://issues.apache.org/jira/browse/SPARK-) could have a large impact on PySpark users. Just wanted to add, it might be related to this issue or different. There is a regression when using pyspark to read data from HDFS. its performance during map tasks has gone down approx 1 - 0.5x. I have tested the 1.0.2 and the performance was fine, but the 1.1 release candidate has this issue. I tested by setting the following properties to make sure it was not due to these. set(spark.io.compression.codec,lzf).set(spark.shuffle.spill,false) in conf object. Regards, Gurvinder Nick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Compiled, ran on yarn-hadoop-2.3 simple job. 2014-09-04 22:22 GMT+04:00 Henry Saputra henry.sapu...@gmail.com: LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- *Sincerely yoursEgor PakhomovScala Developer, Yandex*
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
On Thu, Sep 4, 2014 at 1:50 PM, Gurvinder Singh gurvinder.si...@uninett.no wrote: There is a regression when using pyspark to read data from HDFS. Could you open a JIRA http://issues.apache.org/jira/ with a brief repro? We'll look into it. (You could also provide a repro in a separate thread.) Nick
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8278.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Compiled, ran newly-introduced PySpark Hadoop input/output examples. On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov pahomov.e...@gmail.com wrote: +1 Compiled, ran on yarn-hadoop-2.3 simple job. 2014-09-04 22:22 GMT+04:00 Henry Saputra henry.sapu...@gmail.com: LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- *Sincerely yoursEgor PakhomovScala Developer, Yandex*
[VOTE] Release Apache Spark 1.1.0 (RC4)
Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
I'll kick it off with a +1 On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Tested locally on Mac OS X with local-cluster mode. On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: I'll kick it off with a +1 On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin r...@databricks.com wrote: +1 Tested locally on Mac OS X with local-cluster mode. On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: I'll kick it off with a +1 On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1. Tested some MLlib example code. For default changes, maybe it is useful to mention the default broadcast factory changed to torrent. On Wed, Sep 3, 2014 at 12:34 AM, Michael Armbrust mich...@databricks.com wrote: +1 On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin r...@databricks.com wrote: +1 Tested locally on Mac OS X with local-cluster mode. On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: I'll kick it off with a +1 On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 Tested on Yarn and Windows. Also verified that standalone cluster mode is now fixed. 2014-09-03 1:25 GMT-07:00 Xiangrui Meng men...@gmail.com: +1. Tested some MLlib example code. For default changes, maybe it is useful to mention the default broadcast factory changed to torrent. On Wed, Sep 3, 2014 at 12:34 AM, Michael Armbrust mich...@databricks.com wrote: +1 On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin r...@databricks.com wrote: +1 Tested locally on Mac OS X with local-cluster mode. On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: I'll kick it off with a +1 On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 signatures still fine, tests still pass. On Mac OS X I get the following failure but I think it's spurious. Only mentioning it to see if anyone else sees it. It doesn't happen on Linux. [error] Test org.apache.spark.streaming.kafka.JavaKafkaStreamSuite.testKafkaStream failed: junit.framework.AssertionFailedError: expected:3 but was:0 [error] at junit.framework.Assert.fail(Assert.java:50) [error] at junit.framework.Assert.failNotEquals(Assert.java:287) [error] at junit.framework.Assert.assertEquals(Assert.java:67) [error] at junit.framework.Assert.assertEquals(Assert.java:199) [error] at junit.framework.Assert.assertEquals(Assert.java:205) [error] at org.apache.spark.streaming.kafka.JavaKafkaStreamSuite.testKafkaStream(JavaKafkaStreamSuite.java:129) [error] On Wed, Sep 3, 2014 at 8:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell pwend...@gmail.com wrote: == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. Will these changes be called out in the release notes or somewhere in the docs? That last one (which I believe is what we discovered as the result of SPARK- https://issues.apache.org/jira/browse/SPARK-) could have a large impact on PySpark users. Nick
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 built from sha w/ make-distribution.sh tested basic examples (0 data) w/ local on fedora 20 (openjdk 1.7, python 2.7.5) tested detection and log processing (25GB data) w/ mesos (0.19.0) nfs on rhel 7 (openjdk 1.7, python 2.7.5) On 09/03/2014 03:24 AM, Patrick Wendell wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
Hey Nick, Yeah we'll put those in the release notes. On Wed, Sep 3, 2014 at 7:23 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell pwend...@gmail.com wrote: == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. Will these changes be called out in the release notes or somewhere in the docs? That last one (which I believe is what we discovered as the result of SPARK-) could have a large impact on PySpark users. Nick - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 (non-binding) - checked checksums of a few packages - ran few jobs against yarn client/cluster using hadoop2.3 package - played with spark-shell in yarn-client mode On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- Marcelo - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1. Tested on Windows and EC2. Confirmed that the EC2 pvm-hvm switch fixed the SPARK-3358 regression. On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com) wrote: +1 (non-binding) - checked checksums of a few packages - ran few jobs against yarn client/cluster using hadoop2.3 package - played with spark-shell in yarn-client mode On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- Marcelo - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1. Tested locally on OSX 10.9, built with Hadoop 2.4.1 - Checked Datanucleus jar files - Tested Spark SQL Thrift server and CLI under local mode and standalone cluster against MySQL backed metastore On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen rosenvi...@gmail.com wrote: +1. Tested on Windows and EC2. Confirmed that the EC2 pvm-hvm switch fixed the SPARK-3358 regression. On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com) wrote: +1 (non-binding) - checked checksums of a few packages - ran few jobs against yarn client/cluster using hadoop2.3 package - played with spark-shell in yarn-client mode On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- Marcelo - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 (non-binding) Tested locally on Mac OS X with local-cluster mode. On Wed, Sep 3, 2014 at 12:23 PM, Cheng Lian lian.cs@gmail.com wrote: +1. Tested locally on OSX 10.9, built with Hadoop 2.4.1 - Checked Datanucleus jar files - Tested Spark SQL Thrift server and CLI under local mode and standalone cluster against MySQL backed metastore On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen rosenvi...@gmail.com wrote: +1. Tested on Windows and EC2. Confirmed that the EC2 pvm-hvm switch fixed the SPARK-3358 regression. On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com ) wrote: +1 (non-binding) - checked checksums of a few packages - ran few jobs against yarn client/cluster using hadoop2.3 package - played with spark-shell in yarn-client mode On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org -- Marcelo - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 tested thrift server with our in-house application, everything works fine -- Nan Zhu On Wednesday, September 3, 2014 at 4:43 PM, Matei Zaharia wrote: +1 Matei On September 3, 2014 at 12:24:32 PM, Cheng Lian (lian.cs@gmail.com (mailto:lian.cs@gmail.com)) wrote: +1. Tested locally on OSX 10.9, built with Hadoop 2.4.1 - Checked Datanucleus jar files - Tested Spark SQL Thrift server and CLI under local mode and standalone cluster against MySQL backed metastore On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen rosenvi...@gmail.com (mailto:rosenvi...@gmail.com) wrote: +1. Tested on Windows and EC2. Confirmed that the EC2 pvm-hvm switch fixed the SPARK-3358 regression. On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com (mailto:van...@cloudera.com)) wrote: +1 (non-binding) - checked checksums of a few packages - ran few jobs against yarn client/cluster using hadoop2.3 package - played with spark-shell in yarn-client mode On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell pwend...@gmail.com (mailto:pwend...@gmail.com) wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460 The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4/ Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1031/ The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ Please vote on releasing this package as Apache Spark 1.1.0! The vote is open until Saturday, September 06, at 08:30 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 1.1.0 [ ] -1 Do not release this package because ... To learn more about Apache Spark, please see http://spark.apache.org/ == Regressions fixed since RC3 == SPARK-3332 - Issue with tagging in EC2 scripts SPARK-3358 - Issue with regression for m3.XX instances == What justifies a -1 vote for this release? == This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.0.X will not block this release. == What default changes should I be aware of? == 1. The default value of spark.io.compression.codec is now snappy -- Old behavior can be restored by switching to lzf 2. PySpark now performs external spilling during aggregations. -- Old behavior can be restored by setting spark.shuffle.spill to false. 3. PySpark uses a new heuristic for determining the parallelism of shuffle operations. -- Old behavior can be restored by setting spark.default.parallelism to the number of cores in the cluster. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org (mailto:dev-unsubscr...@spark.apache.org) For additional commands, e-mail: dev-h...@spark.apache.org (mailto:dev-h...@spark.apache.org) -- Marcelo - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org (mailto:dev-unsubscr...@spark.apache.org) For additional commands, e-mail: dev-h...@spark.apache.org (mailto:dev-h...@spark.apache.org)
Re: [VOTE] Release Apache Spark 1.1.0 (RC4)
+1 on OSX Yosemite, built with Hadoop 2.4.1, Hive 0.12 testing SparkSQL, Thrift, MySQL metastore On Wed, Sep 3, 2014 at 4:02 PM, Jeremy Freeman freeman.jer...@gmail.com wrote: +1 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8254.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For additional commands, e-mail: dev-h...@spark.apache.org