Re: [RESULT] [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-11 Thread Patrick Wendell

Hey just a heads up to everyone - running a bit behind on getting the
final artifacts and notes up. Finalizing this release was much more
complicated than previous ones due to new binary formats (we need to
redesign the download page a bit for this to work) and the large
increase in contributor count. Next time we can pipeline this work to
avoid a delay.

I did cut the v1.1.0 tag today. We should be able to do the full
announce tomorrow.

Thanks,
Patrick

On Sun, Sep 7, 2014 at 5:50 PM, Patrick Wendell pwend...@gmail.com wrote:
 This vote passes with 8 binding +1 votes and no -1 votes. I'll post
 the final release in the next 48 hours... just finishing the release
 notes and packaging (which now takes a long time given the number of
 contributors!).

 +1:
 Reynold Xin*
 Michael Armbrust*
 Xiangrui Meng*
 Andrew Or*
 Sean Owen
 Matthew Farrellee
 Marcelo Vanzin
 Josh Rosen*
 Cheng Lian
 Mubarak Seyed
 Matei Zaharia*
 Nan Zhu
 Jeremy Freeman
 Denny Lee
 Tom Graves*
 Henry Saputra
 Egor Pahomov
 Rohit Sinha
 Kan Zhang
 Tathagata Das*
 Reza Zadeh

 -1:

 0:

 * = binding
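
The pass condition from the vote email ("a majority of at least 3 +1 PMC votes") can be checked mechanically against the tally above. This is only a minimal Python sketch: the function names are made up for illustration, the trailing `*` is taken to mark a binding vote as in the list, and "majority" is read here as more binding +1 than binding -1 votes, which is an assumption about the rule rather than an official definition.

```python
def tally_binding(votes):
    """Count binding votes per value; a trailing '*' marks a binding vote."""
    return {
        value: sum(1 for name in names if name.endswith("*"))
        for value, names in votes.items()
    }


def vote_passes(counts, minimum_plus_ones=3):
    # Assumed reading of the rule: at least three binding +1 votes,
    # and more binding +1 votes than binding -1 votes.
    return counts["+1"] >= minimum_plus_ones and counts["+1"] > counts["-1"]


# Names copied from the tally in this message.
votes = {
    "+1": [
        "Reynold Xin*", "Michael Armbrust*", "Xiangrui Meng*", "Andrew Or*",
        "Sean Owen", "Matthew Farrellee", "Marcelo Vanzin", "Josh Rosen*",
        "Cheng Lian", "Mubarak Seyed", "Matei Zaharia*", "Nan Zhu",
        "Jeremy Freeman", "Denny Lee", "Tom Graves*", "Henry Saputra",
        "Egor Pahomov", "Rohit Sinha", "Kan Zhang", "Tathagata Das*",
        "Reza Zadeh",
    ],
    "-1": [],
    "0": [],
}

counts = tally_binding(votes)
print(counts["+1"], vote_passes(counts))
```

Non-binding votes are ignored by the count but still appear in the tally, which matches how the result is reported above.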

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-06 Thread Tathagata Das

Tested streaming integration with Flume on a local test bed.


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-06 Thread Reza Zadeh

+1
Tested the recently merged MLlib matrix multiplication bugfix:
https://github.com/apache/spark/pull/2224


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Tom Graves

+1. Ran Spark on YARN on Hadoop 0.23 and 2.x.

Tom



Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Gurvinder Singh

On 09/03/2014 04:23 PM, Nicholas Chammas wrote:
 On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell pwend...@gmail.com wrote:
 
 == What default changes should I be aware of? ==
 1. The default value of spark.io.compression.codec is now snappy
 -- Old behavior can be restored by switching to lzf

 2. PySpark now performs external spilling during aggregations.
 -- Old behavior can be restored by setting spark.shuffle.spill to
 false.

 3. PySpark uses a new heuristic for determining the parallelism of
 shuffle operations.
 -- Old behavior can be restored by setting
 spark.default.parallelism to the number of cores in the cluster.

 
 Will these changes be called out in the release notes or somewhere in the
 docs?
 
 That last one (which I believe is what we discovered as the result of
 SPARK- https://issues.apache.org/jira/browse/SPARK-) could have a
 large impact on PySpark users.

Just wanted to add that this may or may not be related to the issue above.
There is a regression when using PySpark to read data from HDFS:
performance during map tasks has gone down by approx 1 - 0.5x. I tested
1.0.2 and the performance was fine, but the 1.1 release candidate has this
issue. To make sure it was not caused by the new defaults, I tested with
the following properties set

conf.set("spark.io.compression.codec", "lzf").set("spark.shuffle.spill", "false")

on the conf object.

Regards,
Gurvinder
 
 Nick
 



Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Henry Saputra

LICENSE and NOTICE files are good
Hash files are good
Signature files are good
No third-party executables
Source compiled
Ran local and standalone tests
Persisting off-heap with Tachyon looks good

+1

- Henry
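
A check like "Hash files are good" can be reproduced locally by recomputing an artifact's digest and comparing it with the published value. The following is a rough Python sketch, not the procedure used for this vote: the helper names are hypothetical, SHA-512 is an assumed algorithm, and the published digest is assumed to be available as a hex string (possibly with spaces, line breaks, or upper-case letters). Signature verification needs the signing key and a tool such as GnuPG and is not covered here.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Hash a file in chunks so large release tarballs need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def normalize_hex(text):
    """Drop all whitespace and lower-case the published hex digest."""
    return "".join(text.split()).lower()


def digest_matches(path, published_hex, algorithm="sha512"):
    """Return True when the recomputed digest equals the published one."""
    return file_digest(path, algorithm) == normalize_hex(published_hex)
```

Calling `digest_matches` with the path of a downloaded artifact and the digest text from its hash file returns `True` only when the two agree.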


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Egor Pahomov

+1

Compiled and ran a simple job on yarn-hadoop-2.3.


-- 
Sincerely yours
Egor Pakhomov
Scala Developer, Yandex

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Nicholas Chammas

On Thu, Sep 4, 2014 at 1:50 PM, Gurvinder Singh gurvinder.si...@uninett.no
wrote:

 There is a regression when using pyspark to read data
 from HDFS.


Could you open a JIRA (http://issues.apache.org/jira/) with a brief repro?
We'll look into it.

(You could also provide a repro in a separate thread.)

Nick

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread randomuser54

+1




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Kan Zhang

Compiled, ran newly-introduced PySpark Hadoop input/output examples.


[VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Patrick Wendell

Please vote on releasing the following candidate as Apache Spark version 1.1.0!

The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1031/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/

Please vote on releasing this package as Apache Spark 1.1.0!

The vote is open until Saturday, September 06, at 08:30 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== Regressions fixed since RC3 ==
SPARK-3332 - Issue with tagging in EC2 scripts
SPARK-3358 - Issue with regression for m3.XX instances

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.0.X will not block
this release.

== What default changes should I be aware of? ==
1. The default value of spark.io.compression.codec is now snappy
-- Old behavior can be restored by switching to lzf

2. PySpark now performs external spilling during aggregations.
-- Old behavior can be restored by setting spark.shuffle.spill to false.

3. PySpark uses a new heuristic for determining the parallelism of
shuffle operations.
-- Old behavior can be restored by setting
spark.default.parallelism to the number of cores in the cluster.
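
For anyone who wants the pre-1.1 behavior back, the three settings above can be collected in one place. This is only a sketch under stated assumptions: the helper names are hypothetical, the core count is something you supply for your own cluster, and the output uses the whitespace-separated `key value` layout of conf/spark-defaults.conf.

```python
def legacy_settings(total_cores):
    """Key/value pairs that restore the old defaults listed above."""
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    return [
        # 1. Compression codec: switch back from snappy to lzf.
        ("spark.io.compression.codec", "lzf"),
        # 2. Disable external spilling during aggregations.
        ("spark.shuffle.spill", "false"),
        # 3. Pin shuffle parallelism to the number of cores in the cluster.
        ("spark.default.parallelism", str(total_cores)),
    ]


def to_defaults_conf(settings):
    """Render the pairs as lines suitable for conf/spark-defaults.conf."""
    return "\n".join(f"{key} {value}" for key, value in settings)


# Example: a cluster assumed to have 16 cores in total.
print(to_defaults_conf(legacy_settings(16)))
```

The same pairs could also be passed to a SparkConf object (for example via its `setAll` method) instead of being written to a file.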


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Patrick Wendell

I'll kick it off with a +1


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Reynold Xin

+1

Tested locally on Mac OS X with local-cluster mode.





Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Michael Armbrust

+1



Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Xiangrui Meng

+1. Tested some MLlib example code.

For default changes, maybe it is useful to mention the default
broadcast factory changed to torrent.


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Andrew Or

+1. Tested on YARN and Windows. Also verified that standalone cluster mode
is now fixed.
