Re: [RESULT] [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-10 Thread Patrick Wendell
Hey just a heads up to everyone - running a bit behind on getting the
final artifacts and notes up. Finalizing this release was much more
complicated than previous ones due to new binary formats (we need to
redesign the download page a bit for this to work) and the large
increase in contributor count. Next time we can pipeline this work to
avoid a delay.

I did cut the v1.1.0 tag today. We should be able to make the full
announcement tomorrow.

Thanks,
Patrick

On Sun, Sep 7, 2014 at 5:50 PM, Patrick Wendell wrote:
> This vote passes with 8 binding +1 votes and no -1 votes. I'll post
> the final release in the next 48 hours... just finishing the release
> notes and packaging (which now takes a long time given the number of
> contributors!).
>
> +1:
> Reynold Xin*
> Michael Armbrust*
> Xiangrui Meng*
> Andrew Or*
> Sean Owen
> Matthew Farrellee
> Marcelo Vanzin
> Josh Rosen*
> Cheng Lian
> Mubarak Seyed
> Matei Zaharia*
> Nan Zhu
> Jeremy Freeman
> Denny Lee
> Tom Graves*
> Henry Saputra
> Egor Pahomov
> Rohit Sinha
> Kan Zhang
> Tathagata Das*
> Reza Zadeh
>
> -1:
>
> 0:
>
> * = binding

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



[RESULT] [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-07 Thread Patrick Wendell
This vote passes with 8 binding +1 votes and no -1 votes. I'll post
the final release in the next 48 hours... just finishing the release
notes and packaging (which now takes a long time given the number of
contributors!).

+1:
Reynold Xin*
Michael Armbrust*
Xiangrui Meng*
Andrew Or*
Sean Owen
Matthew Farrellee
Marcelo Vanzin
Josh Rosen*
Cheng Lian
Mubarak Seyed
Matei Zaharia*
Nan Zhu
Jeremy Freeman
Denny Lee
Tom Graves*
Henry Saputra
Egor Pahomov
Rohit Sinha
Kan Zhang
Tathagata Das*
Reza Zadeh

-1:

0:

* = binding
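
For anyone double-checking the count, the tally above can be reproduced with a small script. This is only a sketch: the function names are made up for illustration, and `vote_passes` encodes one reading of the stated rule (at least 3 binding +1 votes, and more binding +1 than binding -1 votes), not an official definition.

```python
# The +1 list from the result message; a trailing "*" marks a binding vote.
PLUS_ONE_VOTES = """\
Reynold Xin*
Michael Armbrust*
Xiangrui Meng*
Andrew Or*
Sean Owen
Matthew Farrellee
Marcelo Vanzin
Josh Rosen*
Cheng Lian
Mubarak Seyed
Matei Zaharia*
Nan Zhu
Jeremy Freeman
Denny Lee
Tom Graves*
Henry Saputra
Egor Pahomov
Rohit Sinha
Kan Zhang
Tathagata Das*
Reza Zadeh
"""


def tally(vote_block):
    """Return (all voters, binding voters) parsed from a block of names."""
    voters = [line.strip() for line in vote_block.splitlines() if line.strip()]
    # Strip the "*" marker so binding voters are listed by plain name.
    binding = [name[:-1] for name in voters if name.endswith("*")]
    return voters, binding


def vote_passes(binding_plus, binding_minus, minimum=3):
    """Assumed reading of the rule: enough binding +1s and a majority of them."""
    return binding_plus >= minimum and binding_plus > binding_minus


if __name__ == "__main__":
    voters, binding = tally(PLUS_ONE_VOTES)
    print(len(voters), "total +1 votes,", len(binding), "binding")
    print("passes:", vote_passes(len(binding), binding_minus=0))
```

Run against the list above, this yields the 8 binding votes reported in the message.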




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-06 Thread Reza Zadeh
+1
Tested the recently merged MLlib matrix multiplication bugfix.



On Sat, Sep 6, 2014 at 2:35 PM, Tathagata Das 
wrote:

> +1
>
> Tested streaming integration with flume on a local test bed.
>
>
> On Thu, Sep 4, 2014 at 6:08 PM, Kan Zhang  wrote:
>
> > +1
> >
> > Compiled, ran newly-introduced PySpark Hadoop input/output examples.
> >
> >
> > On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov 
> > wrote:
> >
> > > +1
> > >
> > > Compiled, ran on yarn-hadoop-2.3 simple job.
> > >
> > >
> > > 2014-09-04 22:22 GMT+04:00 Henry Saputra :
> > >
> > > > LICENSE and NOTICE files are good
> > > > Hash files are good
> > > > Signature files are good
> > > > No 3rd parties executables
> > > > Source compiled
> > > > Run local and standalone tests
> > > > Test persist off heap with Tachyon looks good
> > > >
> > > > +1
> > > >
> > > > - Henry
> > > >
> > > > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  >
> > > > wrote:
> > > > > Please vote on releasing the following candidate as Apache Spark
> > > version
> > > > 1.1.0!
> > > > >
> > > > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > > > >
> > > >
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > > > >
> > > > > The release files, including signatures, digests, etc. can be found
> > at:
> > > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > > > >
> > > > > Release artifacts are signed with the following key:
> > > > > https://people.apache.org/keys/committer/pwendell.asc
> > > > >
> > > > > The staging repository for this release can be found at:
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapachespark-1031/
> > > > >
> > > > > The documentation corresponding to this release can be found at:
> > > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > > > >
> > > > > Please vote on releasing this package as Apache Spark 1.1.0!
> > > > >
> > > > > The vote is open until Saturday, September 06, at 08:30 UTC and
> > passes
> > > if
> > > > > a majority of at least 3 +1 PMC votes are cast.
> > > > >
> > > > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > > > [ ] -1 Do not release this package because ...
> > > > >
> > > > > To learn more about Apache Spark, please see
> > > > > http://spark.apache.org/
> > > > >
> > > > > == Regressions fixed since RC3 ==
> > > > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > > > SPARK-3358 - Issue with regression for m3.XX instances
> > > > >
> > > > > == What justifies a -1 vote for this release? ==
> > > > > This vote is happening very late into the QA period compared with
> > > > > previous votes, so -1 votes should only occur for significant
> > > > > regressions from 1.0.2. Bugs already present in 1.0.X will not
> block
> > > > > this release.
> > > > >
> > > > > == What default changes should I be aware of? ==
> > > > > 1. The default value of "spark.io.compression.codec" is now
> "snappy"
> > > > > --> Old behavior can be restored by switching to "lzf"
> > > > >
> > > > > 2. PySpark now performs external spilling during aggregations.
> > > > > --> Old behavior can be restored by setting "spark.shuffle.spill"
> to
> > > > "false".
> > > > >
> > > > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > > > shuffle operations.
> > > > > --> Old behavior can be restored by setting
> > > > > "spark.default.parallelism" to the number of cores in the cluster.
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > >
> > >
> > > > *Sincerely yours
> > > > Egor Pakhomov
> > > > Scala Developer, Yandex*
> > >
> >
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-06 Thread Tathagata Das
+1

Tested streaming integration with flume on a local test bed.


On Thu, Sep 4, 2014 at 6:08 PM, Kan Zhang  wrote:

> +1
>
> Compiled, ran newly-introduced PySpark Hadoop input/output examples.
>
>
> On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov 
> wrote:
>
> > +1
> >
> > Compiled, ran on yarn-hadoop-2.3 simple job.
> >
> >
> > 2014-09-04 22:22 GMT+04:00 Henry Saputra :
> >
> > > LICENSE and NOTICE files are good
> > > Hash files are good
> > > Signature files are good
> > > No 3rd parties executables
> > > Source compiled
> > > Run local and standalone tests
> > > Test persist off heap with Tachyon looks good
> > >
> > > +1
> > >
> > > - Henry
> > >
> > > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> > > wrote:
> > > > Please vote on releasing the following candidate as Apache Spark
> > version
> > > 1.1.0!
> > > >
> > > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > > >
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > > >
> > > > The release files, including signatures, digests, etc. can be found
> at:
> > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > > >
> > > > Release artifacts are signed with the following key:
> > > > https://people.apache.org/keys/committer/pwendell.asc
> > > >
> > > > The staging repository for this release can be found at:
> > > >
> > https://repository.apache.org/content/repositories/orgapachespark-1031/
> > > >
> > > > The documentation corresponding to this release can be found at:
> > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > > >
> > > > Please vote on releasing this package as Apache Spark 1.1.0!
> > > >
> > > > The vote is open until Saturday, September 06, at 08:30 UTC and
> passes
> > if
> > > > a majority of at least 3 +1 PMC votes are cast.
> > > >
> > > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > To learn more about Apache Spark, please see
> > > > http://spark.apache.org/
> > > >
> > > > == Regressions fixed since RC3 ==
> > > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > > SPARK-3358 - Issue with regression for m3.XX instances
> > > >
> > > > == What justifies a -1 vote for this release? ==
> > > > This vote is happening very late into the QA period compared with
> > > > previous votes, so -1 votes should only occur for significant
> > > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > > > this release.
> > > >
> > > > == What default changes should I be aware of? ==
> > > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > > > --> Old behavior can be restored by switching to "lzf"
> > > >
> > > > 2. PySpark now performs external spilling during aggregations.
> > > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > > "false".
> > > >
> > > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > > shuffle operations.
> > > > --> Old behavior can be restored by setting
> > > > "spark.default.parallelism" to the number of cores in the cluster.
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> > --
> >
> >
> >
> > > *Sincerely yours
> > > Egor Pakhomov
> > > Scala Developer, Yandex*
> >
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Kan Zhang
+1

Compiled, ran newly-introduced PySpark Hadoop input/output examples.


On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov  wrote:

> +1
>
> Compiled, ran on yarn-hadoop-2.3 simple job.
>
>
> 2014-09-04 22:22 GMT+04:00 Henry Saputra :
>
> > LICENSE and NOTICE files are good
> > Hash files are good
> > Signature files are good
> > No 3rd parties executables
> > Source compiled
> > Run local and standalone tests
> > Test persist off heap with Tachyon looks good
> >
> > +1
> >
> > - Henry
> >
> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> > wrote:
> > > Please vote on releasing the following candidate as Apache Spark
> version
> > 1.1.0!
> > >
> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > >
> > > The release files, including signatures, digests, etc. can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > >
> > > Release artifacts are signed with the following key:
> > > https://people.apache.org/keys/committer/pwendell.asc
> > >
> > > The staging repository for this release can be found at:
> > >
> https://repository.apache.org/content/repositories/orgapachespark-1031/
> > >
> > > The documentation corresponding to this release can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > >
> > > Please vote on releasing this package as Apache Spark 1.1.0!
> > >
> > > The vote is open until Saturday, September 06, at 08:30 UTC and passes
> if
> > > a majority of at least 3 +1 PMC votes are cast.
> > >
> > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > [ ] -1 Do not release this package because ...
> > >
> > > To learn more about Apache Spark, please see
> > > http://spark.apache.org/
> > >
> > > == Regressions fixed since RC3 ==
> > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > SPARK-3358 - Issue with regression for m3.XX instances
> > >
> > > == What justifies a -1 vote for this release? ==
> > > This vote is happening very late into the QA period compared with
> > > previous votes, so -1 votes should only occur for significant
> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > > this release.
> > >
> > > == What default changes should I be aware of? ==
> > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > > --> Old behavior can be restored by switching to "lzf"
> > >
> > > 2. PySpark now performs external spilling during aggregations.
> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > "false".
> > >
> > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > shuffle operations.
> > > --> Old behavior can be restored by setting
> > > "spark.default.parallelism" to the number of cores in the cluster.
> > >
> > >
> >
> >
> >
>
>
> --
>
>
>
> *Sincerely yours
> Egor Pakhomov
> Scala Developer, Yandex*
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread randomuser54
+1



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8278.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Nicholas Chammas
On Thu, Sep 4, 2014 at 1:50 PM, Gurvinder Singh 
wrote:

> There is a regression when using pyspark to read data
> from HDFS.
>

Could you open a JIRA with a brief repro?
We'll look into it.

(You could also provide a repro in a separate thread.)

Nick


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Egor Pahomov
+1

Compiled, ran a simple job on yarn-hadoop-2.3.


2014-09-04 22:22 GMT+04:00 Henry Saputra :

> LICENSE and NOTICE files are good
> Hash files are good
> Signature files are good
> No 3rd parties executables
> Source compiled
> Run local and standalone tests
> Test persist off heap with Tachyon looks good
>
> +1
>
> - Henry
>
> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> wrote:
> > Please vote on releasing the following candidate as Apache Spark version
> 1.1.0!
> >
> > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> >
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1031/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> >
> > Please vote on releasing this package as Apache Spark 1.1.0!
> >
> > The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> > a majority of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 1.1.0
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see
> > http://spark.apache.org/
> >
> > == Regressions fixed since RC3 ==
> > SPARK-3332 - Issue with tagging in EC2 scripts
> > SPARK-3358 - Issue with regression for m3.XX instances
> >
> > == What justifies a -1 vote for this release? ==
> > This vote is happening very late into the QA period compared with
> > previous votes, so -1 votes should only occur for significant
> > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > this release.
> >
> > == What default changes should I be aware of? ==
> > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > --> Old behavior can be restored by switching to "lzf"
> >
> > 2. PySpark now performs external spilling during aggregations.
> > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> "false".
> >
> > 3. PySpark uses a new heuristic for determining the parallelism of
> > shuffle operations.
> > --> Old behavior can be restored by setting
> > "spark.default.parallelism" to the number of cores in the cluster.
> >
> >
>
>
>


-- 



*Sincerely yours
Egor Pakhomov
Scala Developer, Yandex*


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Henry Saputra
LICENSE and NOTICE files are good
Hash files are good
Signature files are good
No third-party executables
Source compiled
Ran local and standalone tests
Tested persist off heap with Tachyon; looks good

+1

- Henry

On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  wrote:
> Please vote on releasing the following candidate as Apache Spark version 
> 1.1.0!
>
> The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1031/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
>
> Please vote on releasing this package as Apache Spark 1.1.0!
>
> The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.1.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> == Regressions fixed since RC3 ==
> SPARK-3332 - Issue with tagging in EC2 scripts
> SPARK-3358 - Issue with regression for m3.XX instances
>
> == What justifies a -1 vote for this release? ==
> This vote is happening very late into the QA period compared with
> previous votes, so -1 votes should only occur for significant
> regressions from 1.0.2. Bugs already present in 1.0.X will not block
> this release.
>
> == What default changes should I be aware of? ==
> 1. The default value of "spark.io.compression.codec" is now "snappy"
> --> Old behavior can be restored by switching to "lzf"
>
> 2. PySpark now performs external spilling during aggregations.
> --> Old behavior can be restored by setting "spark.shuffle.spill" to "false".
>
> 3. PySpark uses a new heuristic for determining the parallelism of
> shuffle operations.
> --> Old behavior can be restored by setting
> "spark.default.parallelism" to the number of cores in the cluster.
>
>




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Gurvinder Singh
On 09/03/2014 04:23 PM, Nicholas Chammas wrote:
> On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell  wrote:
> 
>> == What default changes should I be aware of? ==
>> 1. The default value of "spark.io.compression.codec" is now "snappy"
>> --> Old behavior can be restored by switching to "lzf"
>>
>> 2. PySpark now performs external spilling during aggregations.
>> --> Old behavior can be restored by setting "spark.shuffle.spill" to
>> "false".
>>
>> 3. PySpark uses a new heuristic for determining the parallelism of
>> shuffle operations.
>> --> Old behavior can be restored by setting
>> "spark.default.parallelism" to the number of cores in the cluster.
>>
> 
> Will these changes be called out in the release notes or somewhere in the
> docs?
> 
> That last one (which I believe is what we discovered as the result of
> SPARK- ) could have a
> large impact on PySpark users.

Just wanted to add something that may or may not be related to this issue.
There is a regression when using pyspark to read data
from HDFS. Performance during map tasks has dropped to roughly half
(approx. 1x -> 0.5x). I tested 1.0.2 and its performance was fine, but the
1.1 release candidate has this issue. To rule out the new defaults as the
cause, I also tested with the following properties set on the conf object:

set("spark.io.compression.codec","lzf").set("spark.shuffle.spill","false")

Regards,
Gurvinder
> 
> Nick
> 





Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Tom Graves
+1. Ran spark on yarn on hadoop 0.23 and 2.x.

Tom


On Wednesday, September 3, 2014 2:25 AM, Patrick Wendell  
wrote:
 


Please vote on releasing the following candidate as Apache Spark version 1.1.0!

The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1031/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/

Please vote on releasing this package as Apache Spark 1.1.0!

The vote is open until Saturday, September 06, at 08:30 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== Regressions fixed since RC3 ==
SPARK-3332 - Issue with tagging in EC2 scripts
SPARK-3358 - Issue with regression for m3.XX instances

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.0.X will not block
this release.

== What default changes should I be aware of? ==
1. The default value of "spark.io.compression.codec" is now "snappy"
--> Old behavior can be restored by switching to "lzf"

2. PySpark now performs external spilling during aggregations.
--> Old behavior can be restored by setting "spark.shuffle.spill" to "false".

3. PySpark uses a new heuristic for determining the parallelism of
shuffle operations.
--> Old behavior can be restored by setting
"spark.default.parallelism" to the number of cores in the cluster.
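
The three rollbacks above could be applied together through conf/spark-defaults.conf, which takes one whitespace-separated "key value" pair per line. Below is a minimal sketch that generates those lines. The helper names `legacy_defaults` and `to_conf_lines` are hypothetical, and the core count of 16 is only a placeholder for the real number of cores in your cluster.

```python
def legacy_defaults(total_cores):
    """Properties that restore the pre-1.1.0 behavior listed above.

    total_cores is site-specific: pass the number of cores in your cluster.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    return {
        "spark.io.compression.codec": "lzf",             # 1.1.0 default: snappy
        "spark.shuffle.spill": "false",                  # turn spilling off
        "spark.default.parallelism": str(total_cores),   # old parallelism rule
    }


def to_conf_lines(props):
    """Render properties as spark-defaults.conf lines, sorted by key."""
    return ["{} {}".format(key, props[key]) for key in sorted(props)]


if __name__ == "__main__":
    # 16 is a placeholder core count, not a recommendation.
    for line in to_conf_lines(legacy_defaults(total_cores=16)):
        print(line)
```

Each override is independent, so dropping a key from the dictionary keeps the new 1.1.0 default for that property.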


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Denny Lee
+1

on OSX Yosemite, built with Hadoop 2.4.1, Hive 0.12 testing SparkSQL,
Thrift, MySQL metastore



On Wed, Sep 3, 2014 at 4:02 PM, Jeremy Freeman 
wrote:

> +1
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8254.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
>
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Jeremy Freeman
+1



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8254.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Nan Zhu
+1 tested thrift server with our in-house application, everything works fine 

-- 
Nan Zhu


On Wednesday, September 3, 2014 at 4:43 PM, Matei Zaharia wrote:

> +1
> 
> Matei
> 
> On September 3, 2014 at 12:24:32 PM, Cheng Lian (lian.cs@gmail.com) wrote:
> 
> +1. 
> 
> Tested locally on OSX 10.9, built with Hadoop 2.4.1 
> 
> - Checked Datanucleus jar files 
> - Tested Spark SQL Thrift server and CLI under local mode and standalone 
> cluster against MySQL backed metastore 
> 
> 
> 
> On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen wrote:
> 
> > +1. Tested on Windows and EC2. Confirmed that the EC2 pvm->hvm switch 
> > fixed the SPARK-3358 regression. 
> > 
> > 
> > On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com) wrote:
> > 
> > +1 (non-binding) 
> > 
> > - checked checksums of a few packages 
> > - ran few jobs against yarn client/cluster using hadoop2.3 package 
> > - played with spark-shell in yarn-client mode 
> > 
> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell wrote:
> > > Please vote on releasing the following candidate as Apache Spark version 
> > 
> > 1.1.0! 
> > > 
> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): 
> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> >  
> > > 
> > > The release files, including signatures, digests, etc. can be found at: 
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/ 
> > > 
> > > Release artifacts are signed with the following key: 
> > > https://people.apache.org/keys/committer/pwendell.asc 
> > > 
> > > The staging repository for this release can be found at: 
> > > https://repository.apache.org/content/repositories/orgapachespark-1031/ 
> > > 
> > > The documentation corresponding to this release can be found at: 
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ 
> > > 
> > > Please vote on releasing this package as Apache Spark 1.1.0! 
> > > 
> > > The vote is open until Saturday, September 06, at 08:30 UTC and passes if 
> > > a majority of at least 3 +1 PMC votes are cast. 
> > > 
> > > [ ] +1 Release this package as Apache Spark 1.1.0 
> > > [ ] -1 Do not release this package because ... 
> > > 
> > > To learn more about Apache Spark, please see 
> > > http://spark.apache.org/ 
> > > 
> > > == Regressions fixed since RC3 == 
> > > SPARK-3332 - Issue with tagging in EC2 scripts 
> > > SPARK-3358 - Issue with regression for m3.XX instances 
> > > 
> > > == What justifies a -1 vote for this release? == 
> > > This vote is happening very late into the QA period compared with 
> > > previous votes, so -1 votes should only occur for significant 
> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block 
> > > this release. 
> > > 
> > > == What default changes should I be aware of? == 
> > > 1. The default value of "spark.io.compression.codec" is now "snappy" 
> > > --> Old behavior can be restored by switching to "lzf" 
> > > 
> > > 2. PySpark now performs external spilling during aggregations. 
> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to 
> > > 
> > 
> > "false". 
> > > 
> > > 3. PySpark uses a new heuristic for determining the parallelism of 
> > > shuffle operations. 
> > > --> Old behavior can be restored by setting 
> > > "spark.default.parallelism" to the number of cores in the cluster. 
> > > 
> > > 
> > 
> > 
> > 
> > 
> > -- 
> > Marcelo 
> > 
> > 
> 
> 
> 




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Matei Zaharia
+1

Matei

On September 3, 2014 at 12:24:32 PM, Cheng Lian (lian.cs@gmail.com) wrote:

+1. 

Tested locally on OSX 10.9, built with Hadoop 2.4.1 

- Checked Datanucleus jar files 
- Tested Spark SQL Thrift server and CLI under local mode and standalone 
cluster against MySQL backed metastore 



On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen  wrote: 

> +1. Tested on Windows and EC2. Confirmed that the EC2 pvm->hvm switch 
> fixed the SPARK-3358 regression. 
> 
> 
> On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com) 
> wrote: 
> 
> +1 (non-binding) 
> 
> - checked checksums of a few packages 
> - ran few jobs against yarn client/cluster using hadoop2.3 package 
> - played with spark-shell in yarn-client mode 
> 
> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  
> wrote: 
> > Please vote on releasing the following candidate as Apache Spark version 
> 1.1.0! 
> > 
> > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): 
> > 
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>  
> > 
> > The release files, including signatures, digests, etc. can be found at: 
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4/ 
> > 
> > Release artifacts are signed with the following key: 
> > https://people.apache.org/keys/committer/pwendell.asc 
> > 
> > The staging repository for this release can be found at: 
> > https://repository.apache.org/content/repositories/orgapachespark-1031/ 
> > 
> > The documentation corresponding to this release can be found at: 
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/ 
> > 
> > Please vote on releasing this package as Apache Spark 1.1.0! 
> > 
> > The vote is open until Saturday, September 06, at 08:30 UTC and passes if 
> > a majority of at least 3 +1 PMC votes are cast. 
> > 
> > [ ] +1 Release this package as Apache Spark 1.1.0 
> > [ ] -1 Do not release this package because ... 
> > 
> > To learn more about Apache Spark, please see 
> > http://spark.apache.org/ 
> > 
> > == Regressions fixed since RC3 == 
> > SPARK-3332 - Issue with tagging in EC2 scripts 
> > SPARK-3358 - Issue with regression for m3.XX instances 
> > 
> > == What justifies a -1 vote for this release? == 
> > This vote is happening very late into the QA period compared with 
> > previous votes, so -1 votes should only occur for significant 
> > regressions from 1.0.2. Bugs already present in 1.0.X will not block 
> > this release. 
> > 
> > == What default changes should I be aware of? == 
> > 1. The default value of "spark.io.compression.codec" is now "snappy" 
> > --> Old behavior can be restored by switching to "lzf" 
> > 
> > 2. PySpark now performs external spilling during aggregations. 
> > --> Old behavior can be restored by setting "spark.shuffle.spill" to 
> "false". 
> > 
> > 3. PySpark uses a new heuristic for determining the parallelism of 
> > shuffle operations. 
> > --> Old behavior can be restored by setting 
> > "spark.default.parallelism" to the number of cores in the cluster. 
> > 
> > 
> 
> 
> 
> -- 
> Marcelo 
> 
> 
> 


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Mubarak Seyed
+1 (non-binding)

Tested locally on Mac OS X with local-cluster mode.


On Wed, Sep 3, 2014 at 12:23 PM, Cheng Lian  wrote:

> +1.
>
> Tested locally on OSX 10.9, built with Hadoop 2.4.1
>
> - Checked Datanucleus jar files
> - Tested Spark SQL Thrift server and CLI under local mode and standalone
> cluster against MySQL backed metastore
>
>
>
> On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen  wrote:
>
> > +1.  Tested on Windows and EC2.  Confirmed that the EC2 pvm->hvm switch
> > fixed the SPARK-3358 regression.
> >
> >
> > On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com)
> > wrote:
> >
> > +1 (non-binding)
> >
> > - checked checksums of a few packages
> > - ran few jobs against yarn client/cluster using hadoop2.3 package
> > - played with spark-shell in yarn-client mode
> >
> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> > wrote:
> > > Please vote on releasing the following candidate as Apache Spark
> > > version 1.1.0!
> > >
> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > >
> > > The release files, including signatures, digests, etc. can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > >
> > > Release artifacts are signed with the following key:
> > > https://people.apache.org/keys/committer/pwendell.asc
> > >
> > > The staging repository for this release can be found at:
> > > https://repository.apache.org/content/repositories/orgapachespark-1031/
> > >
> > > The documentation corresponding to this release can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > >
> > > Please vote on releasing this package as Apache Spark 1.1.0!
> > >
> > > The vote is open until Saturday, September 06, at 08:30 UTC and passes
> > > if a majority of at least 3 +1 PMC votes are cast.
> > >
> > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > [ ] -1 Do not release this package because ...
> > >
> > > To learn more about Apache Spark, please see
> > > http://spark.apache.org/
> > >
> > > == Regressions fixed since RC3 ==
> > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > SPARK-3358 - Issue with regression for m3.XX instances
> > >
> > > == What justifies a -1 vote for this release? ==
> > > This vote is happening very late into the QA period compared with
> > > previous votes, so -1 votes should only occur for significant
> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > > this release.
> > >
> > > == What default changes should I be aware of? ==
> > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > > --> Old behavior can be restored by switching to "lzf"
> > >
> > > 2. PySpark now performs external spilling during aggregations.
> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > > "false".
> > >
> > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > shuffle operations.
> > > --> Old behavior can be restored by setting
> > > "spark.default.parallelism" to the number of cores in the cluster.
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > > For additional commands, e-mail: dev-h...@spark.apache.org
> > >
> >
> >
> >
> > --
> > Marcelo
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > For additional commands, e-mail: dev-h...@spark.apache.org
> >
> >
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Cheng Lian
+1.

Tested locally on OSX 10.9, built with Hadoop 2.4.1

- Checked Datanucleus jar files
- Tested Spark SQL Thrift server and CLI under local mode and standalone
cluster against MySQL backed metastore



On Wed, Sep 3, 2014 at 11:25 AM, Josh Rosen  wrote:

> +1.  Tested on Windows and EC2.  Confirmed that the EC2 pvm->hvm switch
> fixed the SPARK-3358 regression.
>
>
> On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com)
> wrote:
>
> +1 (non-binding)
>
> - checked checksums of a few packages
> - ran few jobs against yarn client/cluster using hadoop2.3 package
> - played with spark-shell in yarn-client mode
>
> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> wrote:
> > Please vote on releasing the following candidate as Apache Spark version
> > 1.1.0!
> >
> > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1031/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> >
> > Please vote on releasing this package as Apache Spark 1.1.0!
> >
> > The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> > a majority of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 1.1.0
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see
> > http://spark.apache.org/
> >
> > == Regressions fixed since RC3 ==
> > SPARK-3332 - Issue with tagging in EC2 scripts
> > SPARK-3358 - Issue with regression for m3.XX instances
> >
> > == What justifies a -1 vote for this release? ==
> > This vote is happening very late into the QA period compared with
> > previous votes, so -1 votes should only occur for significant
> > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > this release.
> >
> > == What default changes should I be aware of? ==
> > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > --> Old behavior can be restored by switching to "lzf"
> >
> > 2. PySpark now performs external spilling during aggregations.
> > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > "false".
> >
> > 3. PySpark uses a new heuristic for determining the parallelism of
> > shuffle operations.
> > --> Old behavior can be restored by setting
> > "spark.default.parallelism" to the number of cores in the cluster.
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > For additional commands, e-mail: dev-h...@spark.apache.org
> >
>
>
>
> --
> Marcelo
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Josh Rosen
+1.  Tested on Windows and EC2.  Confirmed that the EC2 pvm->hvm switch fixed 
the SPARK-3358 regression.


On September 3, 2014 at 10:33:45 AM, Marcelo Vanzin (van...@cloudera.com) wrote:

+1 (non-binding)  

- checked checksums of a few packages  
- ran few jobs against yarn client/cluster using hadoop2.3 package  
- played with spark-shell in yarn-client mode  

On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  wrote:  
> Please vote on releasing the following candidate as Apache Spark version 
> 1.1.0!  
>  
> The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):  
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>   
>  
> The release files, including signatures, digests, etc. can be found at:  
> http://people.apache.org/~pwendell/spark-1.1.0-rc4/  
>  
> Release artifacts are signed with the following key:  
> https://people.apache.org/keys/committer/pwendell.asc  
>  
> The staging repository for this release can be found at:  
> https://repository.apache.org/content/repositories/orgapachespark-1031/  
>  
> The documentation corresponding to this release can be found at:  
> http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/  
>  
> Please vote on releasing this package as Apache Spark 1.1.0!  
>  
> The vote is open until Saturday, September 06, at 08:30 UTC and passes if  
> a majority of at least 3 +1 PMC votes are cast.  
>  
> [ ] +1 Release this package as Apache Spark 1.1.0  
> [ ] -1 Do not release this package because ...  
>  
> To learn more about Apache Spark, please see  
> http://spark.apache.org/  
>  
> == Regressions fixed since RC3 ==  
> SPARK-3332 - Issue with tagging in EC2 scripts  
> SPARK-3358 - Issue with regression for m3.XX instances  
>  
> == What justifies a -1 vote for this release? ==  
> This vote is happening very late into the QA period compared with  
> previous votes, so -1 votes should only occur for significant  
> regressions from 1.0.2. Bugs already present in 1.0.X will not block  
> this release.  
>  
> == What default changes should I be aware of? ==  
> 1. The default value of "spark.io.compression.codec" is now "snappy"  
> --> Old behavior can be restored by switching to "lzf"  
>  
> 2. PySpark now performs external spilling during aggregations.  
> --> Old behavior can be restored by setting "spark.shuffle.spill" to "false". 
>  
>  
> 3. PySpark uses a new heuristic for determining the parallelism of  
> shuffle operations.  
> --> Old behavior can be restored by setting  
> "spark.default.parallelism" to the number of cores in the cluster.  
>  
> -  
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org  
> For additional commands, e-mail: dev-h...@spark.apache.org  
>  



--  
Marcelo  




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Marcelo Vanzin
+1 (non-binding)

- checked checksums of a few packages
- ran a few jobs against yarn client/cluster using hadoop2.3 package
- played with spark-shell in yarn-client mode
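
The checksum check mentioned above can be sketched in a few lines of Python. This is only an illustration, not the procedure Marcelo actually used: the helper names file_digest and digest_matches are hypothetical, and the digest algorithm (SHA-512 here) is an assumption that should be swapped for whatever the published digest files use.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Hash a file in chunks so large release tarballs need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex, algorithm="sha512"):
    """Compare against a published digest, ignoring whitespace and letter case."""
    normalized = "".join(expected_hex.split()).lower()
    return file_digest(path, algorithm) == normalized
```

A call such as digest_matches("some-downloaded-package.tgz", published_value) would return True only when the local file hashes to the published value; both arguments here are placeholders. A matching digest only shows the download is intact, so the signature should still be checked against the release key.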

On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  wrote:
> Please vote on releasing the following candidate as Apache Spark version 
> 1.1.0!
>
> The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1031/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
>
> Please vote on releasing this package as Apache Spark 1.1.0!
>
> The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.1.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> == Regressions fixed since RC3 ==
> SPARK-3332 - Issue with tagging in EC2 scripts
> SPARK-3358 - Issue with regression for m3.XX instances
>
> == What justifies a -1 vote for this release? ==
> This vote is happening very late into the QA period compared with
> previous votes, so -1 votes should only occur for significant
> regressions from 1.0.2. Bugs already present in 1.0.X will not block
> this release.
>
> == What default changes should I be aware of? ==
> 1. The default value of "spark.io.compression.codec" is now "snappy"
> --> Old behavior can be restored by switching to "lzf"
>
> 2. PySpark now performs external spilling during aggregations.
> --> Old behavior can be restored by setting "spark.shuffle.spill" to "false".
>
> 3. PySpark uses a new heuristic for determining the parallelism of
> shuffle operations.
> --> Old behavior can be restored by setting
> "spark.default.parallelism" to the number of cores in the cluster.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>



-- 
Marcelo




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Patrick Wendell
Hey Nick,

Yeah we'll put those in the release notes.

On Wed, Sep 3, 2014 at 7:23 AM, Nicholas Chammas
 wrote:
> On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell  wrote:
>>
>> == What default changes should I be aware of? ==
>> 1. The default value of "spark.io.compression.codec" is now "snappy"
>> --> Old behavior can be restored by switching to "lzf"
>>
>> 2. PySpark now performs external spilling during aggregations.
>> --> Old behavior can be restored by setting "spark.shuffle.spill" to
>> "false".
>>
>> 3. PySpark uses a new heuristic for determining the parallelism of
>> shuffle operations.
>> --> Old behavior can be restored by setting
>> "spark.default.parallelism" to the number of cores in the cluster.
>
>
> Will these changes be called out in the release notes or somewhere in the
> docs?
>
> That last one (which I believe is what we discovered as the result of
> SPARK-) could have a large impact on PySpark users.
>
> Nick




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Matthew Farrellee

+1

built from sha w/ make-distribution.sh
tested basic examples (0 data) w/ local on fedora 20 (openjdk 1.7, 
python 2.7.5)
tested detection and log processing (25GB data) w/ mesos (0.19.0) & nfs 
on rhel 7 (openjdk 1.7, python 2.7.5)


On 09/03/2014 03:24 AM, Patrick Wendell wrote:

Please vote on releasing the following candidate as Apache Spark version 1.1.0!

The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1031/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/

Please vote on releasing this package as Apache Spark 1.1.0!

The vote is open until Saturday, September 06, at 08:30 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== Regressions fixed since RC3 ==
SPARK-3332 - Issue with tagging in EC2 scripts
SPARK-3358 - Issue with regression for m3.XX instances

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.0.X will not block
this release.

== What default changes should I be aware of? ==
1. The default value of "spark.io.compression.codec" is now "snappy"
--> Old behavior can be restored by switching to "lzf"

2. PySpark now performs external spilling during aggregations.
--> Old behavior can be restored by setting "spark.shuffle.spill" to "false".

3. PySpark uses a new heuristic for determining the parallelism of
shuffle operations.
--> Old behavior can be restored by setting
"spark.default.parallelism" to the number of cores in the cluster.




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Nicholas Chammas
On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell  wrote:

> == What default changes should I be aware of? ==
> 1. The default value of "spark.io.compression.codec" is now "snappy"
> --> Old behavior can be restored by switching to "lzf"
>
> 2. PySpark now performs external spilling during aggregations.
> --> Old behavior can be restored by setting "spark.shuffle.spill" to
> "false".
>
> 3. PySpark uses a new heuristic for determining the parallelism of
> shuffle operations.
> --> Old behavior can be restored by setting
> "spark.default.parallelism" to the number of cores in the cluster.
>

Will these changes be called out in the release notes or somewhere in the
docs?

That last one (which I believe is what we discovered as the result of
SPARK-) could have a large impact on PySpark users.

Nick
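
For anyone who wants the 1.0.x behavior back, the three overrides quoted above could be collected as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the keys and values are taken directly from the vote email, and the output is rendered as whitespace-separated "key value" lines of the kind used in conf/spark-defaults.conf. The core count has to be supplied by the caller, since it depends on the cluster.

```python
def restore_pre_1_1_defaults(total_cores):
    """Settings that, per the vote email, restore the pre-1.1.0 defaults.

    total_cores is the number of cores in the cluster (caller-supplied).
    """
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    return {
        # 1. The compression codec default moved to snappy; lzf is the old value.
        "spark.io.compression.codec": "lzf",
        # 2. Disables the external spilling now performed during aggregations.
        "spark.shuffle.spill": "false",
        # 3. Pins shuffle parallelism instead of relying on the new heuristic.
        "spark.default.parallelism": str(total_cores),
    }


def to_spark_defaults(settings):
    """Render settings as sorted 'key value' lines, one per property."""
    return "\n".join(f"{key} {value}" for key, value in sorted(settings.items()))


if __name__ == "__main__":
    print(to_spark_defaults(restore_pre_1_1_defaults(16)))
```

Whether all three overrides are wanted together is a judgment call; each one can be applied on its own.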


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Sean Owen
+1 signatures still fine, tests still pass. On Mac OS X I get the
following failure but I think it's spurious. Only mentioning it to see
if anyone else sees it. It doesn't happen on Linux.

[error] Test 
org.apache.spark.streaming.kafka.JavaKafkaStreamSuite.testKafkaStream
failed: junit.framework.AssertionFailedError: expected:<3> but was:<0>
[error] at junit.framework.Assert.fail(Assert.java:50)
[error] at junit.framework.Assert.failNotEquals(Assert.java:287)
[error] at junit.framework.Assert.assertEquals(Assert.java:67)
[error] at junit.framework.Assert.assertEquals(Assert.java:199)
[error] at junit.framework.Assert.assertEquals(Assert.java:205)
[error] at 
org.apache.spark.streaming.kafka.JavaKafkaStreamSuite.testKafkaStream(JavaKafkaStreamSuite.java:129)
[error]

On Wed, Sep 3, 2014 at 8:24 AM, Patrick Wendell  wrote:
> Please vote on releasing the following candidate as Apache Spark version 
> 1.1.0!
>
> The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1031/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
>
> Please vote on releasing this package as Apache Spark 1.1.0!
>
> The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.1.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> == Regressions fixed since RC3 ==
> SPARK-3332 - Issue with tagging in EC2 scripts
> SPARK-3358 - Issue with regression for m3.XX instances
>
> == What justifies a -1 vote for this release? ==
> This vote is happening very late into the QA period compared with
> previous votes, so -1 votes should only occur for significant
> regressions from 1.0.2. Bugs already present in 1.0.X will not block
> this release.
>
> == What default changes should I be aware of? ==
> 1. The default value of "spark.io.compression.codec" is now "snappy"
> --> Old behavior can be restored by switching to "lzf"
>
> 2. PySpark now performs external spilling during aggregations.
> --> Old behavior can be restored by setting "spark.shuffle.spill" to "false".
>
> 3. PySpark uses a new heuristic for determining the parallelism of
> shuffle operations.
> --> Old behavior can be restored by setting
> "spark.default.parallelism" to the number of cores in the cluster.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Andrew Or
+1 Tested on Yarn and Windows. Also verified that standalone cluster mode
is now fixed.


2014-09-03 1:25 GMT-07:00 Xiangrui Meng :

> +1. Tested some MLlib example code.
>
> For default changes, maybe it is useful to mention the default
> broadcast factory changed to torrent.
>
> On Wed, Sep 3, 2014 at 12:34 AM, Michael Armbrust
>  wrote:
> > +1
> >
> >
> > On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin wrote:
> >
> >> +1
> >>
> >> Tested locally on Mac OS X with local-cluster mode.
> >>
> >>
> >>
> >>
> >> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> >> wrote:
> >>
> >> > I'll kick it off with a +1
> >> >
> >> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> >> > wrote:
> >> > > Please vote on releasing the following candidate as Apache Spark
> >> > > version 1.1.0!
> >> > >
> >> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> >> > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> >> > >
> >> > > The release files, including signatures, digests, etc. can be found
> >> > > at:
> >> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> >> > >
> >> > > Release artifacts are signed with the following key:
> >> > > https://people.apache.org/keys/committer/pwendell.asc
> >> > >
> >> > > The staging repository for this release can be found at:
> >> > > https://repository.apache.org/content/repositories/orgapachespark-1031/
> >> > >
> >> > > The documentation corresponding to this release can be found at:
> >> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> >> > >
> >> > > Please vote on releasing this package as Apache Spark 1.1.0!
> >> > >
> >> > > The vote is open until Saturday, September 06, at 08:30 UTC and
> >> > > passes if a majority of at least 3 +1 PMC votes are cast.
> >> > >
> >> > > [ ] +1 Release this package as Apache Spark 1.1.0
> >> > > [ ] -1 Do not release this package because ...
> >> > >
> >> > > To learn more about Apache Spark, please see
> >> > > http://spark.apache.org/
> >> > >
> >> > > == Regressions fixed since RC3 ==
> >> > > SPARK-3332 - Issue with tagging in EC2 scripts
> >> > > SPARK-3358 - Issue with regression for m3.XX instances
> >> > >
> >> > > == What justifies a -1 vote for this release? ==
> >> > > This vote is happening very late into the QA period compared with
> >> > > previous votes, so -1 votes should only occur for significant
> >> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> >> > > this release.
> >> > >
> >> > > == What default changes should I be aware of? ==
> >> > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> >> > > --> Old behavior can be restored by switching to "lzf"
> >> > >
> >> > > 2. PySpark now performs external spilling during aggregations.
> >> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> >> > > "false".
> >> > >
> >> > > 3. PySpark uses a new heuristic for determining the parallelism of
> >> > > shuffle operations.
> >> > > --> Old behavior can be restored by setting
> >> > > "spark.default.parallelism" to the number of cores in the cluster.
> >> >
> >> > -
> >> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> >> > For additional commands, e-mail: dev-h...@spark.apache.org
> >> >
> >> >
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Xiangrui Meng
+1. Tested some MLlib example code.

For default changes, maybe it is useful to mention the default
broadcast factory changed to torrent.

On Wed, Sep 3, 2014 at 12:34 AM, Michael Armbrust
 wrote:
> +1
>
>
> On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin  wrote:
>
>> +1
>>
>> Tested locally on Mac OS X with local-cluster mode.
>>
>>
>>
>>
>> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
>> wrote:
>>
>> > I'll kick it off with a +1
>> >
>> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
>> > wrote:
>> > > Please vote on releasing the following candidate as Apache Spark
>> > > version 1.1.0!
>> > >
>> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
>> > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>> > >
>> > > The release files, including signatures, digests, etc. can be found at:
>> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
>> > >
>> > > Release artifacts are signed with the following key:
>> > > https://people.apache.org/keys/committer/pwendell.asc
>> > >
>> > > The staging repository for this release can be found at:
>> > > https://repository.apache.org/content/repositories/orgapachespark-1031/
>> > >
>> > > The documentation corresponding to this release can be found at:
>> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
>> > >
>> > > Please vote on releasing this package as Apache Spark 1.1.0!
>> > >
>> > > The vote is open until Saturday, September 06, at 08:30 UTC and passes
>> > > if a majority of at least 3 +1 PMC votes are cast.
>> > >
>> > > [ ] +1 Release this package as Apache Spark 1.1.0
>> > > [ ] -1 Do not release this package because ...
>> > >
>> > > To learn more about Apache Spark, please see
>> > > http://spark.apache.org/
>> > >
>> > > == Regressions fixed since RC3 ==
>> > > SPARK-3332 - Issue with tagging in EC2 scripts
>> > > SPARK-3358 - Issue with regression for m3.XX instances
>> > >
>> > > == What justifies a -1 vote for this release? ==
>> > > This vote is happening very late into the QA period compared with
>> > > previous votes, so -1 votes should only occur for significant
>> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
>> > > this release.
>> > >
>> > > == What default changes should I be aware of? ==
>> > > 1. The default value of "spark.io.compression.codec" is now "snappy"
>> > > --> Old behavior can be restored by switching to "lzf"
>> > >
>> > > 2. PySpark now performs external spilling during aggregations.
>> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
>> > > "false".
>> > >
>> > > 3. PySpark uses a new heuristic for determining the parallelism of
>> > > shuffle operations.
>> > > --> Old behavior can be restored by setting
>> > > "spark.default.parallelism" to the number of cores in the cluster.
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: dev-h...@spark.apache.org
>> >
>> >
>>




Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Michael Armbrust
+1


On Wed, Sep 3, 2014 at 12:29 AM, Reynold Xin  wrote:

> +1
>
> Tested locally on Mac OS X with local-cluster mode.
>
>
>
>
> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> wrote:
>
> > I'll kick it off with a +1
> >
> > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> > wrote:
> > > Please vote on releasing the following candidate as Apache Spark
> > > version 1.1.0!
> > >
> > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > >
> > > The release files, including signatures, digests, etc. can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > >
> > > Release artifacts are signed with the following key:
> > > https://people.apache.org/keys/committer/pwendell.asc
> > >
> > > The staging repository for this release can be found at:
> > > https://repository.apache.org/content/repositories/orgapachespark-1031/
> > >
> > > The documentation corresponding to this release can be found at:
> > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > >
> > > Please vote on releasing this package as Apache Spark 1.1.0!
> > >
> > > The vote is open until Saturday, September 06, at 08:30 UTC and passes
> > > if a majority of at least 3 +1 PMC votes are cast.
> > >
> > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > [ ] -1 Do not release this package because ...
> > >
> > > To learn more about Apache Spark, please see
> > > http://spark.apache.org/
> > >
> > > == Regressions fixed since RC3 ==
> > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > SPARK-3358 - Issue with regression for m3.XX instances
> > >
> > > == What justifies a -1 vote for this release? ==
> > > This vote is happening very late into the QA period compared with
> > > previous votes, so -1 votes should only occur for significant
> > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > > this release.
> > >
> > > == What default changes should I be aware of? ==
> > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > > --> Old behavior can be restored by switching to "lzf"
> > >
> > > 2. PySpark now performs external spilling during aggregations.
> > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > > "false".
> > >
> > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > shuffle operations.
> > > --> Old behavior can be restored by setting
> > > "spark.default.parallelism" to the number of cores in the cluster.
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > For additional commands, e-mail: dev-h...@spark.apache.org
> >
> >
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Reynold Xin
+1

Tested locally on Mac OS X with local-cluster mode.




On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  wrote:

> I'll kick it off with a +1
>
> On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell 
> wrote:
> > Please vote on releasing the following candidate as Apache Spark version
> > 1.1.0!
> >
> > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> >
> > The release files, including signatures, digests, etc. can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/pwendell.asc
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1031/
> >
> > The documentation corresponding to this release can be found at:
> > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> >
> > Please vote on releasing this package as Apache Spark 1.1.0!
> >
> > The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> > a majority of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Spark 1.1.0
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see
> > http://spark.apache.org/
> >
> > == Regressions fixed since RC3 ==
> > SPARK-3332 - Issue with tagging in EC2 scripts
> > SPARK-3358 - Issue with regression for m3.XX instances
> >
> > == What justifies a -1 vote for this release? ==
> > This vote is happening very late into the QA period compared with
> > previous votes, so -1 votes should only occur for significant
> > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > this release.
> >
> > == What default changes should I be aware of? ==
> > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > --> Old behavior can be restored by switching to "lzf"
> >
> > 2. PySpark now performs external spilling during aggregations.
> > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > "false".
> >
> > 3. PySpark uses a new heuristic for determining the parallelism of
> > shuffle operations.
> > --> Old behavior can be restored by setting
> > "spark.default.parallelism" to the number of cores in the cluster.


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Patrick Wendell
I'll kick it off with a +1

On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell  wrote:
> Please vote on releasing the following candidate as Apache Spark version 1.1.0!
>
> The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
>
> The release files, including signatures, digests, etc. can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1031/
>
> The documentation corresponding to this release can be found at:
> http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
>
> Please vote on releasing this package as Apache Spark 1.1.0!
>
> The vote is open until Saturday, September 06, at 08:30 UTC and passes if
> a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.1.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> == Regressions fixed since RC3 ==
> SPARK-3332 - Issue with tagging in EC2 scripts
> SPARK-3358 - Issue with regression for m3.XX instances
>
> == What justifies a -1 vote for this release? ==
> This vote is happening very late into the QA period compared with
> previous votes, so -1 votes should only occur for significant
> regressions from 1.0.2. Bugs already present in 1.0.X will not block
> this release.
>
> == What default changes should I be aware of? ==
> 1. The default value of "spark.io.compression.codec" is now "snappy"
> --> Old behavior can be restored by switching to "lzf"
>
> 2. PySpark now performs external spilling during aggregations.
> --> Old behavior can be restored by setting "spark.shuffle.spill" to "false".
>
> 3. PySpark uses a new heuristic for determining the parallelism of
> shuffle operations.
> --> Old behavior can be restored by setting
> "spark.default.parallelism" to the number of cores in the cluster.




[VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-03 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.1.0!

The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1031/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/

Please vote on releasing this package as Apache Spark 1.1.0!

The vote is open until Saturday, September 06, at 08:30 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...
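
For anyone tallying by hand, the passing rule above can be sketched roughly as follows. This is one reading of "a majority of at least 3 +1 PMC votes", assuming it means at least three binding +1 votes and more binding +1 than binding -1 votes. The function name, the ballot representation, and the sample ballots are hypothetical and used only for illustration.

```python
from collections import Counter


def vote_passes(ballots, min_binding_plus_ones=3):
    """Decide whether a release vote passes.

    ballots: iterable of (vote, is_binding) pairs, where vote is +1, 0, or -1.
    Assumption: only binding (PMC) votes count toward the result.
    """
    binding = Counter(vote for vote, is_binding in ballots if is_binding)
    plus_ones = binding[1]
    minus_ones = binding[-1]
    # Need the minimum number of binding +1 votes AND a majority over binding -1.
    return plus_ones >= min_binding_plus_ones and plus_ones > minus_ones


if __name__ == "__main__":
    # Hypothetical ballots: three binding +1, two non-binding +1, one binding -1.
    sample = [(1, True)] * 3 + [(1, False)] * 2 + [(-1, True)]
    print(vote_passes(sample))
```

Non-binding votes are ignored by this sketch for the pass/fail decision, although they are still recorded in the result mail.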

To learn more about Apache Spark, please see
http://spark.apache.org/

== Regressions fixed since RC3 ==
SPARK-3332 - Issue with tagging in EC2 scripts
SPARK-3358 - Issue with regression for m3.XX instances

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.0.X will not block
this release.

== What default changes should I be aware of? ==
1. The default value of "spark.io.compression.codec" is now "snappy"
--> Old behavior can be restored by switching to "lzf"

2. PySpark now performs external spilling during aggregations.
--> Old behavior can be restored by setting "spark.shuffle.spill" to "false".

3. PySpark uses a new heuristic for determining the parallelism of
shuffle operations.
--> Old behavior can be restored by setting
"spark.default.parallelism" to the number of cores in the cluster.
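
As a rough illustration, the three overrides listed above could be collected into entries for conf/spark-defaults.conf. This is only a sketch: the helper names are hypothetical, and the core count of 16 is a placeholder assumption that should be replaced with the real number of cores in your cluster.

```python
def legacy_defaults(total_cores):
    """Return the three settings the announcement lists for restoring old behavior.

    total_cores is an assumption supplied by the caller: the number of cores
    in the cluster, used for spark.default.parallelism.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be positive")
    return {
        "spark.io.compression.codec": "lzf",
        "spark.shuffle.spill": "false",
        "spark.default.parallelism": str(total_cores),
    }


def to_properties(settings):
    """Render settings as whitespace-separated key/value lines."""
    width = max(len(key) for key in settings)
    return "\n".join(
        f"{key.ljust(width)}  {value}" for key, value in sorted(settings.items())
    )


if __name__ == "__main__":
    # 16 is a placeholder core count, not a recommendation.
    print(to_properties(legacy_defaults(total_cores=16)))
```

The printed lines can be appended to conf/spark-defaults.conf, or the same keys can be set on a SparkConf in the application.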
