subject:"\[VOTE\] Release Apache Spark 1.2.0 \(RC2\)"

Re: [RESULT] [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-18 Thread Patrick Wendell

Update: An Apache infrastructure issue prevented me from pushing this
last night. The issue was resolved today and I should be able to push
the final release artifacts tonight.

On Tue, Dec 16, 2014 at 9:20 PM, Patrick Wendell pwend...@gmail.com wrote:
 This vote has PASSED with 12 +1 votes (8 binding) and no 0 or -1 votes:

 +1:
 Matei Zaharia*
 Madhu Siddalingaiah
 Reynold Xin*
 Sandy Ryza
 Josh Rozen*
 Mark Hamstra*
 Denny Lee
 Tom Graves*
 GuiQiang Li
 Nick Pentreath*
 Sean McNamara*
 Patrick Wendell*

 0:

 -1:

 I'll finalize and package this release in the next 48 hours. Thanks to
 everyone who contributed.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Fwd: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-17 Thread Krishna Sankar

Forgot Reply To All ;o(
-- Forwarded message --
From: Krishna Sankar ksanka...@gmail.com
Date: Wed, Dec 10, 2014 at 9:16 PM
Subject: Re: [VOTE] Release Apache Spark 1.2.0 (RC2)
To: Matei Zaharia matei.zaha...@gmail.com

+1
Works same as RC1
1. Compiled OSX 10.10 (Yosemite) mvn -Pyarn -Phadoop-2.4
-Dhadoop.version=2.4.0 -DskipTests clean package 13:07 min
2. Tested pyspark, mlib - running as well as compare results with 1.1.x
2.1. statistics OK
2.2. Linear/Ridge/Laso Regression OK
   Slight difference in the print method (vs. 1.1.x) of the model
object - with a label  more details. This is good.
2.3. Decision Tree, Naive Bayes OK
   Changes in print(model) - now print (model.ToDebugString()) - OK
   Some changes in NaiveBayes. Different from my 1.1.x code - had to
flatten list structures, zip required same number in partitions
   After code changes ran fine.
2.4. KMeans OK
   Center And Scale OK
   zip occasionally fails with error localhost):
org.apache.spark.SparkException: Can only zip RDDs with same number of
elements in each partition
Has https://issues.apache.org/jira/browse/SPARK-2251 reappeared ?
Made it work by doing a different transformation ie reusing an original
rdd.
(Xiangrui, I will end you the iPython Notebook  the dataset by a separate
e-mail)
2.5. rdd operations OK
   State of the Union Texts - MapReduce, Filter,sortByKey (word count)
2.6. recommendation OK
2.7. Good work ! In 1.x.x, had a map distinct over the movielens medium
dataset which never worked. Works fine in 1.2.0 !
3. Scala Mlib - subset of examples as in #2 above, with Scala
3.1. statistics OK
3.2. Linear Regression OK
3.3. Decision Tree OK
3.4. KMeans OK
Cheers
k/

On Wed, Dec 10, 2014 at 3:05 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:

 +1

 Tested on Mac OS X.

 Matei

  On Dec 10, 2014, at 1:08 PM, Patrick Wendell pwend...@gmail.com wrote:
 
  Please vote on releasing the following candidate as Apache Spark version
 1.2.0!
 
  The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc2/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1055/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Saturday, December 13, at 21:00 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.2.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening relatively late into the QA period, so
  -1 votes should only occur for significant regressions from
  1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to
 sort.
  -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.
 
  == How does this differ from RC1 ==
  This has fixes for a handful of issues identified - some of the
  notable fixes are:
 
  [Core]
  SPARK-4498: Standalone Master can fail to recognize completed/failed
  applications
 
  [SQL]
  SPARK-4552: Query for empty parquet table in spark sql hive get
  IllegalArgumentException
  SPARK-4753: Parquet2 does not prune based on OR filters on partition
 columns
  SPARK-4761: With JDBC server, set Kryo as default serializer and
  disable reference tracking
  SPARK-4785: When called with arguments referring column fields, PMOD
 throws NPE
 
  - Patrick
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  For additional commands, e-mail: dev-h...@spark.apache.org
 


 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org

[RESULT] [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell

This vote has PASSED with 12 +1 votes (8 binding) and no 0 or -1 votes:

+1:
Matei Zaharia*
Madhu Siddalingaiah
Reynold Xin*
Sandy Ryza
Josh Rozen*
Mark Hamstra*
Denny Lee
Tom Graves*
GuiQiang Li
Nick Pentreath*
Sean McNamara*
Patrick Wendell*

0:

-1:

I'll finalize and package this release in the next 48 hours. Thanks to
everyone who contributed.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-16 Thread Patrick Wendell

I'm closing this vote now, will send results in a new thread.

On Sat, Dec 13, 2014 at 12:47 PM, Sean McNamara
sean.mcnam...@webtrends.com wrote:
 +1 tested on OS X and deployed+tested our apps via YARN into our staging 
 cluster.

 Sean


 On Dec 11, 2014, at 10:40 AM, Reynold Xin r...@databricks.com wrote:

 +1

 Tested on OS X.

 On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!

 The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1055/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Saturday, December 13, at 21:00 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.2.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening relatively late into the QA period, so
 -1 votes should only occur for significant regressions from
 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.

 == How does this differ from RC1 ==
 This has fixes for a handful of issues identified - some of the
 notable fixes are:

 [Core]
 SPARK-4498: Standalone Master can fail to recognize completed/failed
 applications

 [SQL]
 SPARK-4552: Query for empty parquet table in spark sql hive get
 IllegalArgumentException
 SPARK-4753: Parquet2 does not prune based on OR filters on partition
 columns
 SPARK-4761: With JDBC server, set Kryo as default serializer and
 disable reference tracking
 SPARK-4785: When called with arguments referring column fields, PMOD
 throws NPE

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org javascript:;
 For additional commands, e-mail: dev-h...@spark.apache.org javascript:;




 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread Tom Graves

+1 built and tested on Yarn on Hadoop 2.x cluster.
Tom 

 On Saturday, December 13, 2014 12:48 AM, Denny Lee denny.g@gmail.com 
wrote:
   

 +1 Tested on OSX

Tested Scala 2.10.3, SparkSQL with Hive 0.12 / Hadoop 2.5, Thrift Server,
MLLib SVD


On Fri Dec 12 2014 at 8:57:16 PM Mark Hamstra m...@clearstorydata.com
wrote:

 +1

 On Fri, Dec 12, 2014 at 8:00 PM, Josh Rosen rosenvi...@gmail.com wrote:
 
  +1.  Tested using spark-perf and the Spark EC2 scripts.  I didn’t notice
  any performance regressions that could not be attributed to changes of
  default configurations.  To be more specific, when running Spark 1.2.0
 with
  the Spark 1.1.0 settings of spark.shuffle.manager=hash and
  spark.shuffle.blockTransferService=nio, there was no performance
 regression
  and, in fact, there were significant performance improvements for some
  workloads.
 
  In Spark 1.2.0, the new default settings are spark.shuffle.manager=sort
  and spark.shuffle.blockTransferService=netty.  With these new settings,
 I
  noticed a performance regression in the scala-sort-by-key-int spark-perf
  test.  However, Spark 1.1.0 and 1.1.1 exhibit a similar performance
  regression for that same test when run with spark.shuffle.manager=sort,
 so
  this regression seems explainable by the change of defaults.  Besides
 this,
  most of the other tests ran at the same speeds or faster with the new
 1.2.0
  defaults.  Also, keep in mind that this is a somewhat artificial micro
  benchmark; I have heard anecdotal reports from many users that their real
  workloads have run faster with 1.2.0.
 
  Based on these results, I’m comfortable giving a +1 on 1.2.0 RC2.
 
  - Josh
 
  On December 11, 2014 at 9:52:39 AM, Sandy Ryza (sandy.r...@cloudera.com)
  wrote:
 
  +1 (non-binding). Tested on Ubuntu against YARN.
 
  On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin r...@databricks.com
 wrote:
 
   +1
  
   Tested on OS X.
  
   On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com
   wrote:
  
Please vote on releasing the following candidate as Apache Spark
  version
1.2.0!
   
The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
   
   
  
  https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=
 a428c446e23e628b746e0626cc02b7b3cadf588e
   
The release files, including signatures, digests, etc. can be found
 at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2/
   
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc
   
The staging repository for this release can be found at:
   
  https://repository.apache.org/content/repositories/orgapachespark-1055/
   
The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/
   
Please vote on releasing this package as Apache Spark 1.2.0!
   
The vote is open until Saturday, December 13, at 21:00 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.
   
[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...
   
To learn more about Apache Spark, please see
http://spark.apache.org/
   
== What justifies a -1 vote for this release? ==
This vote is happening relatively late into the QA period, so
-1 votes should only occur for significant regressions from
1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.
   
== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has
 been
changed to netty
-- Old behavior can be restored by switching to nio
   
2. The default value of spark.shuffle.manager has been changed to
   sort.
-- Old behavior can be restored by setting spark.shuffle.manager
 to
hash.
   
== How does this differ from RC1 ==
This has fixes for a handful of issues identified - some of the
notable fixes are:
   
[Core]
SPARK-4498: Standalone Master can fail to recognize completed/failed
applications
   
[SQL]
SPARK-4552: Query for empty parquet table in spark sql hive get
IllegalArgumentException
SPARK-4753: Parquet2 does not prune based on OR filters on partition
columns
SPARK-4761: With JDBC server, set Kryo as default serializer and
disable reference tracking
SPARK-4785: When called with arguments referring column fields, PMOD
throws NPE
   
- Patrick
   

 -
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  javascript:;
For additional commands, e-mail: dev-h...@spark.apache.org
   javascript:;

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread GuoQiang Li

+1 (non-binding).  Tested on CentOS 6.4


-- Original --
From:  Patrick Wendell;pwend...@gmail.com;
Date:  Thu, Dec 11, 2014 05:08 AM
To:  dev@spark.apache.orgdev@spark.apache.org;


Subject:  [VOTE] Release Apache Spark 1.2.0 (RC2)



Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1055/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/

Please vote on releasing this package as Apache Spark 1.2.0!

The vote is open until Saturday, December 13, at 21:00 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening relatively late into the QA period, so
-1 votes should only occur for significant regressions from
1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been
changed to netty
-- Old behavior can be restored by switching to nio

2. The default value of spark.shuffle.manager has been changed to sort.
-- Old behavior can be restored by setting spark.shuffle.manager to hash.

== How does this differ from RC1 ==
This has fixes for a handful of issues identified - some of the
notable fixes are:

[Core]
SPARK-4498: Standalone Master can fail to recognize completed/failed
applications

[SQL]
SPARK-4552: Query for empty parquet table in spark sql hive get
IllegalArgumentException
SPARK-4753: Parquet2 does not prune based on OR filters on partition columns
SPARK-4761: With JDBC server, set Kryo as default serializer and
disable reference tracking
SPARK-4785: When called with arguments referring column fields, PMOD throws NPE

- Patrick

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread Nick Pentreath

+1

—
Sent from Mailbox

On Sat, Dec 13, 2014 at 3:12 PM, GuoQiang Li wi...@qq.com wrote:

 +1 (non-binding).  Tested on CentOS 6.4
 -- Original --
 From:  Patrick Wendell;pwend...@gmail.com;
 Date:  Thu, Dec 11, 2014 05:08 AM
 To:  dev发送@spark.apache.orgdev@spark.apache.org;
 Subject:  [VOTE] Release Apache Spark 1.2.0 (RC2)
 Please vote on releasing the following candidate as Apache Spark version 
 1.2.0!
 The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e
 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2/
 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1055/
 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/
 Please vote on releasing this package as Apache Spark 1.2.0!
 The vote is open until Saturday, December 13, at 21:00 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.
 [ ] +1 Release this package as Apache Spark 1.2.0
 [ ] -1 Do not release this package because ...
 To learn more about Apache Spark, please see
 http://spark.apache.org/
 == What justifies a -1 vote for this release? ==
 This vote is happening relatively late into the QA period, so
 -1 votes should only occur for significant regressions from
 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.
 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio
 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to hash.
 == How does this differ from RC1 ==
 This has fixes for a handful of issues identified - some of the
 notable fixes are:
 [Core]
 SPARK-4498: Standalone Master can fail to recognize completed/failed
 applications
 [SQL]
 SPARK-4552: Query for empty parquet table in spark sql hive get
 IllegalArgumentException
 SPARK-4753: Parquet2 does not prune based on OR filters on partition columns
 SPARK-4761: With JDBC server, set Kryo as default serializer and
 disable reference tracking
 SPARK-4785: When called with arguments referring column fields, PMOD throws 
 NPE
 - Patrick
 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread slcclimber

I am building and testing using sbt.
I get a lot of 
Job aborted due to stage failure: Master removed our application: FAILED
did not contain cancelled, and Job aborted due to stage failure: Master
removed our application: FAILED did not contain killed
errors trying to run tests.  (JobCancellationSuite.scala:236)
I have never experienced this before so it is concerning.

I  was able to successfully run all the python examples for spark and Mllib
successfully.




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-2-0-RC2-tp9713p9770.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-13 Thread Sean McNamara

+1 tested on OS X and deployed+tested our apps via YARN into our staging 
cluster.

Sean


 On Dec 11, 2014, at 10:40 AM, Reynold Xin r...@databricks.com wrote:
 
 +1
 
 Tested on OS X.
 
 On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com wrote:
 
 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!
 
 The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e
 
 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2/
 
 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc
 
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1055/
 
 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/
 
 Please vote on releasing this package as Apache Spark 1.2.0!
 
 The vote is open until Saturday, December 13, at 21:00 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Spark 1.2.0
 [ ] -1 Do not release this package because ...
 
 To learn more about Apache Spark, please see
 http://spark.apache.org/
 
 == What justifies a -1 vote for this release? ==
 This vote is happening relatively late into the QA period, so
 -1 votes should only occur for significant regressions from
 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.
 
 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio
 
 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.
 
 == How does this differ from RC1 ==
 This has fixes for a handful of issues identified - some of the
 notable fixes are:
 
 [Core]
 SPARK-4498: Standalone Master can fail to recognize completed/failed
 applications
 
 [SQL]
 SPARK-4552: Query for empty parquet table in spark sql hive get
 IllegalArgumentException
 SPARK-4753: Parquet2 does not prune based on OR filters on partition
 columns
 SPARK-4761: With JDBC server, set Kryo as default serializer and
 disable reference tracking
 SPARK-4785: When called with arguments referring column fields, PMOD
 throws NPE
 
 - Patrick
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org javascript:;
 For additional commands, e-mail: dev-h...@spark.apache.org javascript:;
 
 


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-12 Thread Josh Rosen

+1. Tested using spark-perf and the Spark EC2 scripts. I didn’t notice any
performance regressions that could not be attributed to changes of default
configurations. To be more specific, when running Spark 1.2.0 with the Spark
1.1.0 settings of spark.shuffle.manager=hash and
spark.shuffle.blockTransferService=nio, there was no performance regression
and, in fact, there were significant performance improvements for some
workloads.

In Spark 1.2.0, the new default settings are spark.shuffle.manager=sort and
spark.shuffle.blockTransferService=netty. With these new settings, I noticed a
performance regression in the scala-sort-by-key-int spark-perf test. However,
Spark 1.1.0 and 1.1.1 exhibit a similar performance regression for that same
test when run with spark.shuffle.manager=sort, so this regression seems
explainable by the change of defaults. Besides this, most of the other tests
ran at the same speeds or faster with the new 1.2.0 defaults. Also, keep in
mind that this is a somewhat artificial micro benchmark; I have heard anecdotal
reports from many users that their real workloads have run faster with 1.2.0.

Based on these results, I’m comfortable giving a +1 on 1.2.0 RC2.

- Josh

On December 11, 2014 at 9:52:39 AM, Sandy Ryza (sandy.r...@cloudera.com) wrote:

+1 (non-binding). Tested on Ubuntu against YARN.

On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin r...@databricks.com wrote:

Tested on OS X.

On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com
wrote:

Please vote on releasing the following candidate as Apache Spark version
1.2.0!

The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):

https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1055/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/

Please vote on releasing this package as Apache Spark 1.2.0!

The vote is open until Saturday, December 13, at 21:00 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening relatively late into the QA period, so
-1 votes should only occur for significant regressions from
1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been
changed to netty
-- Old behavior can be restored by switching to nio

2. The default value of spark.shuffle.manager has been changed to
sort.
-- Old behavior can be restored by setting spark.shuffle.manager to
hash.

== How does this differ from RC1 ==
This has fixes for a handful of issues identified - some of the
notable fixes are:

[Core]
SPARK-4498: Standalone Master can fail to recognize completed/failed
applications

[SQL]
SPARK-4552: Query for empty parquet table in spark sql hive get
IllegalArgumentException
SPARK-4753: Parquet2 does not prune based on OR filters on partition
columns
SPARK-4761: With JDBC server, set Kryo as default serializer and
disable reference tracking
SPARK-4785: When called with arguments referring column fields, PMOD
throws NPE

- Patrick

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org javascript:;
For additional commands, e-mail: dev-h...@spark.apache.org
javascript:;

Re: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-12 Thread Mark Hamstra

On Fri, Dec 12, 2014 at 8:00 PM, Josh Rosen rosenvi...@gmail.com wrote:

+1. Tested using spark-perf and the Spark EC2 scripts. I didn’t notice
any performance regressions that could not be attributed to changes of
default configurations. To be more specific, when running Spark 1.2.0 with
the Spark 1.1.0 settings of spark.shuffle.manager=hash and
spark.shuffle.blockTransferService=nio, there was no performance regression
and, in fact, there were significant performance improvements for some
workloads.

In Spark 1.2.0, the new default settings are spark.shuffle.manager=sort
and spark.shuffle.blockTransferService=netty. With these new settings, I
noticed a performance regression in the scala-sort-by-key-int spark-perf
test. However, Spark 1.1.0 and 1.1.1 exhibit a similar performance
regression for that same test when run with spark.shuffle.manager=sort, so
this regression seems explainable by the change of defaults. Besides this,
most of the other tests ran at the same speeds or faster with the new 1.2.0
defaults. Also, keep in mind that this is a somewhat artificial micro
benchmark; I have heard anecdotal reports from many users that their real
workloads have run faster with 1.2.0.

Based on these results, I’m comfortable giving a +1 on 1.2.0 RC2.

- Josh

On December 11, 2014 at 9:52:39 AM, Sandy Ryza (sandy.r...@cloudera.com)
wrote:

+1 (non-binding). Tested on Ubuntu against YARN.

On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin r...@databricks.com wrote:

Tested on OS X.

On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com
wrote:

Please vote on releasing the following candidate as Apache Spark
version
1.2.0!

The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):

https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc2/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:

https://repository.apache.org/content/repositories/orgapachespark-1055/