Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-15 Thread Xiangrui Meng
Hi Krishna,

Thanks for providing the notebook! I tried and found that the problem
is with PySpark's zip. I created a JIRA to track the issue:
https://issues.apache.org/jira/browse/SPARK-4841

-Xiangrui

On Thu, Dec 11, 2014 at 1:55 PM, Krishna Sankar ksanka...@gmail.com wrote:
 K-Means iPython notebook & data attached.
 It is the zip that gives the error; while one of the RDDs comes from the
 prediction, most probably there is no problem with the K-Means itself.
 Lines 34, 35 & 36 are essentially the same, but only 36 works with 1.2.0.
 Interestingly, lines 34, 35 & 36 all work with 1.1.1 (checked just now).

 The plot thickens!
 In 1.1.1, freq_cluster_map.take(5) prints normally for 34 & 35, but in
 exponential form for 36. So there is some difference even in 1.1.1.
 #34, #35:
  [(array([28143, 0,   174,  1,    0, 0, 7000]), 1),
   (array([19244, 0,   215,  2,    0, 0, 6968]), 1),
   (array([41354, 0,  4123,  4,    0, 0, 7034]), 1),
   (array([14776, 0,   500,  1,    0, 0, 6952]), 1),
   (array([97752, 0, 43300, 26, 2077, 4, 6935]), 0)]

 #36:
  [(array([ 2.8143e+04, 0.0000e+00, 1.7400e+02, 1.0000e+00, 0.0000e+00,
            0.0000e+00, 7.0000e+03]), 1),
   (array([ 1.9244e+04, 0.0000e+00, 2.1500e+02, 2.0000e+00, 0.0000e+00,
            0.0000e+00, 6.9680e+03]), 1),
   (array([ 4.1354e+04, 0.0000e+00, 4.1230e+03, 4.0000e+00, 0.0000e+00,
            0.0000e+00, 7.0340e+03]), 1),
   (array([ 1.4776e+04, 0.0000e+00, 5.0000e+02, 1.0000e+00, 0.0000e+00,
            0.0000e+00, 6.9520e+03]), 1),
   (array([ 9.7752e+04, 0.0000e+00, 4.3300e+04, 2.6000e+01, 2.0770e+03,
            4.0000e+00, 6.9350e+03]), 0)]
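
 For reference, a minimal PySpark sketch of the zip pattern that fails on
 1.2.0 and of the workaround of reusing the original RDD; the input file,
 column layout, k, and app name below are hypothetical placeholders, not
 the actual notebook code:

     from pyspark import SparkContext
     from pyspark.mllib.clustering import KMeans
     from numpy import array

     sc = SparkContext(appName="kmeans-zip-check")

     # Hypothetical CSV of numeric features, one row per record.
     data = sc.textFile("data.csv").map(
         lambda line: array([float(x) for x in line.split(",")]))

     model = KMeans.train(data, 2, maxIterations=10)

     # Lines 34/35 style: zip the features with a separately derived
     # prediction RDD. On 1.2.0 this can hit "Can only zip RDDs with same
     # number of elements in each partition" (tracked as SPARK-4841).
     predictions = data.map(lambda p: model.predict(p))
     freq_cluster_map = data.zip(predictions)

     # Workaround: derive both sides from the original RDD in one map,
     # so no zip is needed.
     freq_cluster_map = data.map(lambda p: (p, model.predict(p)))
     freq_cluster_map.take(5)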

 I had overwritten the naive bayes example. Will chase the older versions
 down

 Cheers
 k/

 On Wed, Dec 3, 2014 at 4:19 PM, Xiangrui Meng men...@gmail.com wrote:

 Krishna, could you send me some code snippets for the issues you saw
 in naive Bayes and k-means? -Xiangrui

 On Sun, Nov 30, 2014 at 6:49 AM, Krishna Sankar ksanka...@gmail.com
 wrote:
  +1
  1. Compiled on OSX 10.10 (Yosemite): mvn -Pyarn -Phadoop-2.4
  -Dhadoop.version=2.4.0 -DskipTests clean package; 16:46 min (on a slightly
  slower connection)
  2. Tested pyspark, MLlib - running them as well as comparing results with 1.1.x
  2.1. statistics OK
  2.2. Linear/Ridge/Lasso Regression OK
 Slight difference in the print method (vs. 1.1.x) of the model
  object - with a label & more details. This is good.
  2.3. Decision Tree, Naive Bayes OK
 Changes in print(model) - now print(model.toDebugString()) - OK
  (a minimal sketch follows after this list)
 Some changes in NaiveBayes. Different from my 1.1.x code - had to
  flatten list structures; zip required the same number of elements in
  each partition.
 After the code changes, everything ran fine.
  2.4. KMeans OK
 zip occasionally fails with the error:
  org.apache.spark.SparkException: Can only zip RDDs with same number of
  elements in each partition
  Has https://issues.apache.org/jira/browse/SPARK-2251 reappeared?
  Made it work by doing a different transformation, i.e. reusing an
  original RDD.
  2.5. rdd operations OK
 State of the Union texts - map/reduce, filter, sortByKey (word
  count)
  2.6. recommendation OK
  2.7. Good work! In 1.x.x, had a map/distinct over the MovieLens medium
  dataset which never worked. Works fine in 1.2.0!
  3. Scala MLlib - subset of examples as in #2 above, with Scala
  3.1. statistics OK
  3.2. Linear Regression OK
  3.3. Decision Tree OK
  3.4. KMeans OK
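
 As noted under 2.3, a minimal sketch of the new model printing, assuming
 the MLlib 1.2 Python API; the LIBSVM file name and app name are
 hypothetical, not the actual test code:

     from pyspark import SparkContext
     from pyspark.mllib.tree import DecisionTree
     from pyspark.mllib.util import MLUtils

     sc = SparkContext(appName="tree-print-check")

     # Hypothetical LIBSVM file; any RDD of LabeledPoint works the same way.
     points = MLUtils.loadLibSVMFile(sc, "sample_libsvm_data.txt")

     model = DecisionTree.trainClassifier(points, numClasses=2,
                                          categoricalFeaturesInfo={})

     # 1.1.x habit: print(model) showed the full tree.
     # 1.2.0: print(model) is a one-line summary; use toDebugString()
     # for the full tree.
     print(model.toDebugString())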
  Cheers
  k/
  P.S: Plan to add RF and .ml mechanics to this bank
 
  On Fri, Nov 28, 2014 at 9:16 PM, Patrick Wendell pwend...@gmail.com
  wrote:
 
  Please vote on releasing the following candidate as Apache Spark
  version
  1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 
 
  https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==

[RESULT] [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-10 Thread Patrick Wendell
This vote is closed in favor of RC2.

On Fri, Dec 5, 2014 at 2:02 PM, Patrick Wendell pwend...@gmail.com wrote:
 Hey All,

 Thanks all for the continued testing!

 The issue I mentioned earlier, SPARK-4498, was fixed earlier this week
 (hat tip to Mark Hamstra, who contributed the fix).

 In the interim a few smaller blocker-level issues with Spark SQL were
 found and fixed (SPARK-4753, SPARK-4552, SPARK-4761).

 There is currently an outstanding issue (SPARK-4740[1]) in Spark core
 that needs to be fixed.

 I want to thank in particular Shopify and Intel China who have
 identified and helped test blocker issues with the release. This type
 of workload testing around releases is really helpful for us.

 Once things stabilize I will cut RC2. I think we're pretty close with this 
 one.

 - Patrick

 On Wed, Dec 3, 2014 at 5:38 PM, Takeshi Yamamuro linguin@gmail.com 
 wrote:
 +1 (non-binding)

 Checked on CentOS 6.5, compiled from the source.
Ran various examples in standalone mode (one master and three slaves), and
 browsed the web UI.

 On Sat, Nov 29, 2014 at 2:16 PM, Patrick Wendell pwend...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-05 Thread Patrick Wendell
Hey All,

Thanks all for the continued testing!

The issue I mentioned earlier, SPARK-4498, was fixed earlier this week
(hat tip to Mark Hamstra, who contributed the fix).

In the interim a few smaller blocker-level issues with Spark SQL were
found and fixed (SPARK-4753, SPARK-4552, SPARK-4761).

There is currently an outstanding issue (SPARK-4740[1]) in Spark core
that needs to be fixed.

I want to thank in particular Shopify and Intel China who have
identified and helped test blocker issues with the release. This type
of workload testing around releases is really helpful for us.

Once things stabilize I will cut RC2. I think we're pretty close with this one.

- Patrick

On Wed, Dec 3, 2014 at 5:38 PM, Takeshi Yamamuro linguin@gmail.com wrote:
 +1 (non-binding)

 Checked on CentOS 6.5, compiled from the source.
 Ran various examples in standalone mode (one master and three slaves), and
 browsed the web UI.

 On Sat, Nov 29, 2014 at 2:16 PM, Patrick Wendell pwend...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-04 Thread Takeshi Yamamuro
+1 (non-binding)

Checked on CentOS 6.5, compiled from the source.
Ran various examples in standalone mode (one master and three slaves), and
browsed the web UI.

On Sat, Nov 29, 2014 at 2:16 PM, Patrick Wendell pwend...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org




Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-04 Thread Krishna Sankar
Will do. Am on the road - will annotate an iPython notebook with what works
& what didn't work ...
Cheers
k/

On Wed, Dec 3, 2014 at 4:19 PM, Xiangrui Meng men...@gmail.com wrote:

 Krishna, could you send me some code snippets for the issues you saw
 in naive Bayes and k-means? -Xiangrui

 On Sun, Nov 30, 2014 at 6:49 AM, Krishna Sankar ksanka...@gmail.com
 wrote:
  +1
  1. Compiled on OSX 10.10 (Yosemite): mvn -Pyarn -Phadoop-2.4
  -Dhadoop.version=2.4.0 -DskipTests clean package; 16:46 min (on a slightly
  slower connection)
  2. Tested pyspark, MLlib - running them as well as comparing results with 1.1.x
  2.1. statistics OK
  2.2. Linear/Ridge/Lasso Regression OK
 Slight difference in the print method (vs. 1.1.x) of the model
  object - with a label & more details. This is good.
  2.3. Decision Tree, Naive Bayes OK
 Changes in print(model) - now print(model.toDebugString()) - OK
 Some changes in NaiveBayes. Different from my 1.1.x code - had to
  flatten list structures; zip required the same number of elements in
  each partition.
 After the code changes, everything ran fine.
  2.4. KMeans OK
 zip occasionally fails with the error:
  org.apache.spark.SparkException: Can only zip RDDs with same number of
  elements in each partition
  Has https://issues.apache.org/jira/browse/SPARK-2251 reappeared?
  Made it work by doing a different transformation, i.e. reusing an
  original RDD.
  2.5. rdd operations OK
 State of the Union texts - map/reduce, filter, sortByKey (word
 count)
  2.6. recommendation OK
  2.7. Good work! In 1.x.x, had a map/distinct over the MovieLens medium
  dataset which never worked. Works fine in 1.2.0!
  3. Scala MLlib - subset of examples as in #2 above, with Scala
  3.1. statistics OK
  3.2. Linear Regression OK
  3.3. Decision Tree OK
  3.4. KMeans OK
  Cheers
  k/
  P.S: Plan to add RF and .ml mechanics to this bank
 
  On Fri, Nov 28, 2014 at 9:16 PM, Patrick Wendell pwend...@gmail.com
 wrote:
 
  Please vote on releasing the following candidate as Apache Spark version
  1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening very late into the QA period compared with
  previous votes, so -1 votes should only occur for significant
  regressions from 1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to
 sort.
  -- Old behavior can be restored by setting spark.shuffle.manager to
  hash.
 
  == Other notes ==
  Because this vote is occurring over a weekend, I will likely extend
  the vote if this RC survives until the end of the vote period.
 
  - Patrick
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  For additional commands, e-mail: dev-h...@spark.apache.org
 
 



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-02 Thread Denny Lee
+1 (non-binding)

Verified on OSX 10.10.2, built from source,
spark-shell / spark-submit jobs
ran various simple Spark / Scala queries
ran various SparkSQL queries (including HiveContext)
ran ThriftServer service and connected via beeline
ran SparkSVD


On Mon Dec 01 2014 at 11:09:26 PM Patrick Wendell pwend...@gmail.com
wrote:

 Hey All,

 Just an update. Josh, Andrew, and others are working to reproduce
 SPARK-4498 and fix it. Other than that issue no serious regressions
 have been reported so far. If we are able to get a fix in for that
 soon, we'll likely cut another RC with the patch.

 Continued testing of RC1 is definitely appreciated!

 I'll leave this vote open to allow folks to continue posting comments.
 It's fine to still give +1 from your own testing... i.e. you can
 assume at this point SPARK-4498 will be fixed before releasing.

 - Patrick

 On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia matei.zaha...@gmail.com
 wrote:
  +0.9 from me. Tested it on Mac and Windows (someone has to do it) and
 while things work, I noticed a few recent scripts don't have Windows
 equivalents, namely https://issues.apache.org/jira/browse/SPARK-4683 and
 https://issues.apache.org/jira/browse/SPARK-4684. The first one at least
 would be good to fix if we do another RC. Not blocking the release but
 useful to fix in docs is https://issues.apache.org/jira/browse/SPARK-4685.
 
  Matei
 
 
  On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:
 
  Hi everyone,
 
  There's an open bug report related to Spark standalone which could be a
 potential release-blocker (pending investigation / a bug fix):
 https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems
 non-deterministic and only affects long-running Spark standalone
 deployments, so it may be hard to reproduce.  I'm going to work on a patch
 to add additional logging in order to help with debugging.
 
  I just wanted to give an early heads-up about this issue and to get
 more eyes on it in case anyone else has run into it or wants to help with
 debugging.
 
  - Josh
 
  On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com)
 wrote:
 
  Please vote on releasing the following candidate as Apache Spark
 version 1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
  https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=
 1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening very late into the QA period compared with
  previous votes, so -1 votes should only occur for significant
  regressions from 1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to
 sort.
  -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.
 
  == Other notes ==
  Because this vote is occurring over a weekend, I will likely extend
  the vote if this RC survives until the end of the vote period.
 
  - Patrick
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  For additional commands, e-mail: dev-h...@spark.apache.org
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
  For additional commands, e-mail: dev-h...@spark.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org




Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-02 Thread Jeremy Freeman
+1 (non-binding)

Installed version pre-built for Hadoop on a private HPC
ran PySpark shell w/ iPython
loaded data using custom Hadoop input formats
ran MLlib routines in PySpark
ran custom workflows in PySpark
browsed the web UI

Noticeable improvements in stability and performance during large shuffles (as 
well as the elimination of frequent but unpredictable “FileNotFound / too many 
open files” errors).

We initially hit errors during large collects that ran fine in 1.1, but setting 
the new spark.driver.maxResultSize to 0 preserved the old behavior. Definitely 
worth highlighting this setting in the release notes, as the new default may be 
too small for some users and workloads.
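
A minimal sketch of that workaround, assuming the limit is lifted via
SparkConf before the context is created (the same key can also go in
spark-defaults.conf or be passed with spark-submit --conf):

    from pyspark import SparkConf, SparkContext

    # "0" disables the new result-size cap and restores the 1.1 behavior
    # for large collect()s back to the driver.
    conf = SparkConf().set("spark.driver.maxResultSize", "0")
    sc = SparkContext(conf=conf)

    collected = sc.parallelize(range(10 ** 6)).collect()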

— Jeremy

-
jeremyfreeman.net
@thefreemanlab

On Dec 2, 2014, at 3:22 AM, Denny Lee denny.g@gmail.com wrote:

 +1 (non-binding)
 
 Verified on OSX 10.10.2, built from source,
 spark-shell / spark-submit jobs
 ran various simple Spark / Scala queries
 ran various SparkSQL queries (including HiveContext)
 ran ThriftServer service and connected via beeline
 ran SparkSVD
 
 
 On Mon Dec 01 2014 at 11:09:26 PM Patrick Wendell pwend...@gmail.com
 wrote:
 
 Hey All,
 
 Just an update. Josh, Andrew, and others are working to reproduce
 SPARK-4498 and fix it. Other than that issue no serious regressions
 have been reported so far. If we are able to get a fix in for that
 soon, we'll likely cut another RC with the patch.
 
 Continued testing of RC1 is definitely appreciated!
 
 I'll leave this vote open to allow folks to continue posting comments.
 It's fine to still give +1 from your own testing... i.e. you can
 assume at this point SPARK-4498 will be fixed before releasing.
 
 - Patrick
 
 On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia matei.zaha...@gmail.com
 wrote:
 +0.9 from me. Tested it on Mac and Windows (someone has to do it) and
 while things work, I noticed a few recent scripts don't have Windows
 equivalents, namely https://issues.apache.org/jira/browse/SPARK-4683 and
 https://issues.apache.org/jira/browse/SPARK-4684. The first one at least
 would be good to fix if we do another RC. Not blocking the release but
 useful to fix in docs is https://issues.apache.org/jira/browse/SPARK-4685.
 
 Matei
 
 
 On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:
 
 Hi everyone,
 
 There's an open bug report related to Spark standalone which could be a
 potential release-blocker (pending investigation / a bug fix):
 https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems
 non-deterministic and only affects long-running Spark standalone
 deployments, so it may be hard to reproduce.  I'm going to work on a patch
 to add additional logging in order to help with debugging.
 
 I just wanted to give an early heads-up about this issue and to get
 more eyes on it in case anyone else has run into it or wants to help with
 debugging.
 
 - Josh
 
 On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com)
 wrote:
 
 Please vote on releasing the following candidate as Apache Spark
 version 1.2.0!
 
 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=
 1056e9ec13203d0c51564265e94d77a054498fdb
 
 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc
 
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/
 
 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
 Please vote on releasing this package as Apache Spark 1.2.0!
 
 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...
 
 To learn more about Apache Spark, please see
 http://spark.apache.org/
 
 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.
 
 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio
 
 2. The default value of spark.shuffle.manager has been changed to
 sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.
 
 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.
 
 - Patrick
 
 

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-02 Thread Andrew Or
+1. I also tested on Windows just in case, with jars referring to other jars
and Python files referring to other Python files. Path resolution still works.

2014-12-02 10:16 GMT-08:00 Jeremy Freeman freeman.jer...@gmail.com:

 +1 (non-binding)

 Installed version pre-built for Hadoop on a private HPC
 ran PySpark shell w/ iPython
 loaded data using custom Hadoop input formats
 ran MLlib routines in PySpark
 ran custom workflows in PySpark
 browsed the web UI

 Noticeable improvements in stability and performance during large shuffles
 (as well as the elimination of frequent but unpredictable “FileNotFound /
 too many open files” errors).

 We initially hit errors during large collects that ran fine in 1.1, but
 setting the new spark.driver.maxResultSize to 0 preserved the old behavior.
 Definitely worth highlighting this setting in the release notes, as the new
 default may be too small for some users and workloads.

 — Jeremy

 -
 jeremyfreeman.net
 @thefreemanlab

 On Dec 2, 2014, at 3:22 AM, Denny Lee denny.g@gmail.com wrote:

  +1 (non-binding)
 
  Verified on OSX 10.10.2, built from source,
  spark-shell / spark-submit jobs
  ran various simple Spark / Scala queries
  ran various SparkSQL queries (including HiveContext)
  ran ThriftServer service and connected via beeline
  ran SparkSVD
 
 
  On Mon Dec 01 2014 at 11:09:26 PM Patrick Wendell pwend...@gmail.com
  wrote:
 
  Hey All,
 
  Just an update. Josh, Andrew, and others are working to reproduce
  SPARK-4498 and fix it. Other than that issue no serious regressions
  have been reported so far. If we are able to get a fix in for that
  soon, we'll likely cut another RC with the patch.
 
  Continued testing of RC1 is definitely appreciated!
 
  I'll leave this vote open to allow folks to continue posting comments.
  It's fine to still give +1 from your own testing... i.e. you can
  assume at this point SPARK-4498 will be fixed before releasing.
 
  - Patrick
 
  On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia matei.zaha...@gmail.com
  wrote:
  +0.9 from me. Tested it on Mac and Windows (someone has to do it) and
  while things work, I noticed a few recent scripts don't have Windows
  equivalents, namely https://issues.apache.org/jira/browse/SPARK-4683
 and
  https://issues.apache.org/jira/browse/SPARK-4684. The first one at
 least
  would be good to fix if we do another RC. Not blocking the release but
  useful to fix in docs is
 https://issues.apache.org/jira/browse/SPARK-4685.
 
  Matei
 
 
  On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:
 
  Hi everyone,
 
  There's an open bug report related to Spark standalone which could be
 a
  potential release-blocker (pending investigation / a bug fix):
  https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems
  non-deterministic and only affects long-running Spark standalone
  deployments, so it may be hard to reproduce.  I'm going to work on a
 patch
  to add additional logging in order to help with debugging.
 
  I just wanted to give an early heads-up about this issue and to get
  more eyes on it in case anyone else has run into it or wants to help
 with
  debugging.
 
  - Josh
 
  On November 28, 2014 at 9:18:09 PM, Patrick Wendell (
 pwend...@gmail.com)
  wrote:
 
  Please vote on releasing the following candidate as Apache Spark
  version 1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
  https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=
  1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found
 at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
 
 https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening very late into the QA period compared with
  previous votes, so -1 votes should only occur for significant
  regressions from 1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to sort.

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-02 Thread Tom Graves
+1 tested on yarn.
Tom 

 On Friday, November 28, 2014 11:18 PM, Patrick Wendell 
pwend...@gmail.com wrote:
   

 Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1048/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.2.0!

The vote is open until Tuesday, December 02, at 05:15 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been
changed to netty
-- Old behavior can be restored by switching to nio

2. The default value of spark.shuffle.manager has been changed to sort.
-- Old behavior can be restored by setting spark.shuffle.manager to hash.

== Other notes ==
Because this vote is occurring over a weekend, I will likely extend
the vote if this RC survives until the end of the vote period.

- Patrick

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org





Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-01 Thread Sandy Ryza
+1 (non-binding)

built from source
fired up a spark-shell against YARN cluster
ran some jobs using parallelize
ran some jobs that read files
clicked around the web UI


On Sun, Nov 30, 2014 at 1:10 AM, GuoQiang Li wi...@qq.com wrote:

 +1 (non-binding)




 ------ Original ------
 From: Patrick Wendell pwend...@gmail.com
 Date: Sat, Nov 29, 2014 01:16 PM
 To: dev@spark.apache.org

 Subject:  [VOTE] Release Apache Spark 1.2.0 (RC1)



 Please vote on releasing the following candidate as Apache Spark version
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-01 Thread Josh Rosen
Hi everyone,

There’s an open bug report related to Spark standalone which could be a 
potential release-blocker (pending investigation / a bug fix): 
https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems 
non-deterministic and only affects long-running Spark standalone deployments, so
it may be hard to reproduce.  I’m going to work on a patch to add additional 
logging in order to help with debugging.

I just wanted to give an early heads-up about this issue and to get more eyes
on it in case anyone else has run into it or wants to help with debugging.

- Josh

On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com) wrote:

Please vote on releasing the following candidate as Apache Spark version 1.2.0! 
 

The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):  
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
  

The release files, including signatures, digests, etc. can be found at:  
http://people.apache.org/~pwendell/spark-1.2.0-rc1/  

Release artifacts are signed with the following key:  
https://people.apache.org/keys/committer/pwendell.asc  

The staging repository for this release can be found at:  
https://repository.apache.org/content/repositories/orgapachespark-1048/  

The documentation corresponding to this release can be found at:  
http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/  

Please vote on releasing this package as Apache Spark 1.2.0!  

The vote is open until Tuesday, December 02, at 05:15 UTC and passes  
if a majority of at least 3 +1 PMC votes are cast.  

[ ] +1 Release this package as Apache Spark 1.1.0  
[ ] -1 Do not release this package because ...  

To learn more about Apache Spark, please see  
http://spark.apache.org/  

== What justifies a -1 vote for this release? ==  
This vote is happening very late into the QA period compared with  
previous votes, so -1 votes should only occur for significant  
regressions from 1.0.2. Bugs already present in 1.1.X, minor  
regressions, or bugs related to new features will not block this  
release.  

== What default changes should I be aware of? ==  
1. The default value of spark.shuffle.blockTransferService has been  
changed to netty  
-- Old behavior can be restored by switching to nio  

2. The default value of spark.shuffle.manager has been changed to sort.  
-- Old behavior can be restored by setting spark.shuffle.manager to hash.  

== Other notes ==  
Because this vote is occurring over a weekend, I will likely extend  
the vote if this RC survives until the end of the vote period.  

- Patrick  

-  
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org  
For additional commands, e-mail: dev-h...@spark.apache.org  



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-01 Thread Matei Zaharia
+0.9 from me. Tested it on Mac and Windows (someone has to do it) and while 
things work, I noticed a few recent scripts don't have Windows equivalents, 
namely https://issues.apache.org/jira/browse/SPARK-4683 and 
https://issues.apache.org/jira/browse/SPARK-4684. The first one at least would 
be good to fix if we do another RC. Not blocking the release but useful to fix 
in docs is https://issues.apache.org/jira/browse/SPARK-4685.

Matei


 On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:
 
 Hi everyone,
 
 There’s an open bug report related to Spark standalone which could be a 
 potential release-blocker (pending investigation / a bug fix): 
 https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems 
  non-deterministic and only affects long-running Spark standalone deployments, 
 so it may be hard to reproduce.  I’m going to work on a patch to add 
 additional logging in order to help with debugging.
 
  I just wanted to give an early heads-up about this issue and to get more 
 eyes on it in case anyone else has run into it or wants to help with 
 debugging.
 
 - Josh
 
 On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com) 
 wrote:
 
 Please vote on releasing the following candidate as Apache Spark version 
 1.2.0!  
 
 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):  
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
   
 
 The release files, including signatures, digests, etc. can be found at:  
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/  
 
 Release artifacts are signed with the following key:  
 https://people.apache.org/keys/committer/pwendell.asc  
 
 The staging repository for this release can be found at:  
 https://repository.apache.org/content/repositories/orgapachespark-1048/  
 
 The documentation corresponding to this release can be found at:  
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/  
 
 Please vote on releasing this package as Apache Spark 1.2.0!  
 
 The vote is open until Tuesday, December 02, at 05:15 UTC and passes  
 if a majority of at least 3 +1 PMC votes are cast.  
 
 [ ] +1 Release this package as Apache Spark 1.1.0  
 [ ] -1 Do not release this package because ...  
 
 To learn more about Apache Spark, please see  
 http://spark.apache.org/  
 
 == What justifies a -1 vote for this release? ==  
 This vote is happening very late into the QA period compared with  
 previous votes, so -1 votes should only occur for significant  
 regressions from 1.0.2. Bugs already present in 1.1.X, minor  
 regressions, or bugs related to new features will not block this  
 release.  
 
 == What default changes should I be aware of? ==  
 1. The default value of spark.shuffle.blockTransferService has been  
 changed to netty  
 -- Old behavior can be restored by switching to nio  
 
 2. The default value of spark.shuffle.manager has been changed to sort.  
 -- Old behavior can be restored by setting spark.shuffle.manager to 
 hash.  
 
 == Other notes ==  
 Because this vote is occurring over a weekend, I will likely extend  
 the vote if this RC survives until the end of the vote period.  
 
 - Patrick  
 
 -  
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org  
 For additional commands, e-mail: dev-h...@spark.apache.org  
 


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-12-01 Thread Patrick Wendell
Hey All,

Just an update. Josh, Andrew, and others are working to reproduce
SPARK-4498 and fix it. Other than that issue no serious regressions
have been reported so far. If we are able to get a fix in for that
soon, we'll likely cut another RC with the patch.

Continued testing of RC1 is definitely appreciated!

I'll leave this vote open to allow folks to continue posting comments.
It's fine to still give +1 from your own testing... i.e. you can
assume at this point SPARK-4498 will be fixed before releasing.

- Patrick

On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
 +0.9 from me. Tested it on Mac and Windows (someone has to do it) and while 
 things work, I noticed a few recent scripts don't have Windows equivalents, 
 namely https://issues.apache.org/jira/browse/SPARK-4683 and 
 https://issues.apache.org/jira/browse/SPARK-4684. The first one at least 
 would be good to fix if we do another RC. Not blocking the release but useful 
 to fix in docs is https://issues.apache.org/jira/browse/SPARK-4685.

 Matei


 On Dec 1, 2014, at 11:18 AM, Josh Rosen rosenvi...@gmail.com wrote:

 Hi everyone,

 There's an open bug report related to Spark standalone which could be a 
 potential release-blocker (pending investigation / a bug fix): 
 https://issues.apache.org/jira/browse/SPARK-4498.  This issue seems 
  non-deterministic and only affects long-running Spark standalone deployments, 
 so it may be hard to reproduce.  I'm going to work on a patch to add 
 additional logging in order to help with debugging.

  I just wanted to give an early heads-up about this issue and to get more 
 eyes on it in case anyone else has run into it or wants to help with 
 debugging.

 - Josh

 On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com) 
 wrote:

 Please vote on releasing the following candidate as Apache Spark version 
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to 
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-30 Thread GuoQiang Li
+1 (non-binding)




------ Original ------
From: Patrick Wendell pwend...@gmail.com
Date: Sat, Nov 29, 2014 01:16 PM
To: dev@spark.apache.org

Subject:  [VOTE] Release Apache Spark 1.2.0 (RC1)



Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1048/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.2.0!

The vote is open until Tuesday, December 02, at 05:15 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been
changed to netty
-- Old behavior can be restored by switching to nio

2. The default value of spark.shuffle.manager has been changed to sort.
-- Old behavior can be restored by setting spark.shuffle.manager to hash.

== Other notes ==
Because this vote is occurring over a weekend, I will likely extend
the vote if this RC survives until the end of the vote period.

- Patrick

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-29 Thread slcclimber
+1
1 Compiled binaries
2 All Tests Pass
3 Ran Python and Scala examples for Spark and MLlib on local and on a master + 4
workers




--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-2-0-RC1-tp9546p9552.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-29 Thread Patrick Wendell
Thanks for pointing this out, Matei. I don't think a minor typo like
this is a big deal. Hopefully it's clear to everyone this is the 1.2.0
release vote, as indicated by the subject and all of the artifacts.

On Sat, Nov 29, 2014 at 1:26 AM, Matei Zaharia matei.zaha...@gmail.com wrote:
 Hey Patrick, unfortunately you got some of the text here wrong, saying 1.1.0 
 instead of 1.2.0. Not sure it will matter since there can well be another RC 
 after testing, but we should be careful.

 Matei

 On Nov 28, 2014, at 9:16 PM, Patrick Wendell pwend...@gmail.com wrote:

 Please vote on releasing the following candidate as Apache Spark version 
 1.2.0!

 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

 Please vote on releasing this package as Apache Spark 1.2.0!

 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.

 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio

 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to 
 hash.

 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.

 - Patrick

 -
 To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
 For additional commands, e-mail: dev-h...@spark.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-29 Thread vaquar khan
+1
1 Compiled binaries
2 All Tests Pass

Regards,
Vaquar khan
On 30 Nov 2014 04:21, Krishna Sankar ksanka...@gmail.com wrote:

 +1
 1. Compiled on OSX 10.10 (Yosemite): mvn -Pyarn -Phadoop-2.4
 -Dhadoop.version=2.4.0 -DskipTests clean package; 16:46 min (on a slightly
 slower connection)
 2. Tested pyspark, MLlib - running them as well as comparing results with 1.1.x
 2.1. statistics OK
 2.2. Linear/Ridge/Lasso Regression OK
Slight difference in the print method (vs. 1.1.x) of the model
 object - with a label & more details. This is good.
 2.3. Decision Tree, Naive Bayes OK
Changes in print(model) - now print(model.toDebugString()) - OK
Some changes in NaiveBayes. Different from my 1.1.x code - had to
 flatten list structures; zip required the same number of elements in
 each partition.
After the code changes, everything ran fine.
 2.4. KMeans OK
zip occasionally fails with the error:
 org.apache.spark.SparkException: Can only zip RDDs with same number of
 elements in each partition
 Has https://issues.apache.org/jira/browse/SPARK-2251 reappeared?
 Made it work by doing a different transformation, i.e. reusing an
 original RDD.
 2.5. rdd operations OK
State of the Union texts - map/reduce, filter, sortByKey (word count)
 2.6. recommendation OK
 2.7. Good work! In 1.x.x, had a map/distinct over the MovieLens medium
 dataset which never worked. Works fine in 1.2.0!
 3. Scala MLlib - subset of examples as in #2 above, with Scala
 3.1. statistics OK
 3.2. Linear Regression OK
 3.3. Decision Tree OK
 3.4. KMeans OK
 Cheers
 k/
 P.S: Plan to add RF and .ml mechanics to this bank

 On Fri, Nov 28, 2014 at 9:16 PM, Patrick Wendell pwend...@gmail.com
 wrote:

  Please vote on releasing the following candidate as Apache Spark version
  1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening very late into the QA period compared with
  previous votes, so -1 votes should only occur for significant
  regressions from 1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to
 sort.
  -- Old behavior can be restored by setting spark.shuffle.manager to
  hash.
 
  == Other notes ==
  Because this vote is occurring over a weekend, I will likely extend
  the vote if this RC survives until the end of the vote period.
 
  - Patrick
 
 
 



[VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-28 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.2.0!

The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1048/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.2.0!

The vote is open until Tuesday, December 02, at 05:15 UTC and passes
if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.1.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see
http://spark.apache.org/

== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with
previous votes, so -1 votes should only occur for significant
regressions from 1.0.2. Bugs already present in 1.1.X, minor
regressions, or bugs related to new features will not block this
release.

== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been
changed to netty
-- Old behavior can be restored by switching to nio

2. The default value of spark.shuffle.manager has been changed to sort.
-- Old behavior can be restored by setting spark.shuffle.manager to hash.
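
For reference, a minimal PySpark sketch of restoring both old defaults from
application code; the app name is illustrative, and the same keys can also be
passed to spark-submit via --conf:

    from pyspark import SparkConf, SparkContext

    # restore the pre-1.2.0 shuffle defaults
    conf = (SparkConf()
            .setAppName("legacy-shuffle-defaults")
            .set("spark.shuffle.blockTransferService", "nio")
            .set("spark.shuffle.manager", "hash"))

    sc = SparkContext(conf=conf)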

== Other notes ==
Because this vote is occurring over a weekend, I will likely extend
the vote if this RC survives until the end of the vote period.

- Patrick




Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-28 Thread Reynold Xin
Krishna,

Docs don't block the RC voting because docs can be updated in parallel with
release candidates, up until the point a release is made.


On Fri, Nov 28, 2014 at 9:55 PM, Krishna Sankar ksanka...@gmail.com wrote:

 Looks like the documentation hasn't caught up with the new features
 on the machine learning side - for example org.apache.spark.ml,
 RandomForest, gbtree and so forth. Is a refresh of the documentation
 planned?
 I'm happy to see these capabilities, but they would need good explanations
 as well, especially the new thinking around the ml pipelines,
 transformations et al.
 IMHO, the documentation is a -1.
 Will check out the compilation, MLlib et al.

 Cheers
 k/

 On Fri, Nov 28, 2014 at 9:16 PM, Patrick Wendell pwend...@gmail.com
 wrote:

  Please vote on releasing the following candidate as Apache Spark version
  1.2.0!
 
  The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 
 
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
 
  The release files, including signatures, digests, etc. can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
  Release artifacts are signed with the following key:
  https://people.apache.org/keys/committer/pwendell.asc
 
  The staging repository for this release can be found at:
  https://repository.apache.org/content/repositories/orgapachespark-1048/
 
  The documentation corresponding to this release can be found at:
  http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
  Please vote on releasing this package as Apache Spark 1.2.0!
 
  The vote is open until Tuesday, December 02, at 05:15 UTC and passes
  if a majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Spark 1.1.0
  [ ] -1 Do not release this package because ...
 
  To learn more about Apache Spark, please see
  http://spark.apache.org/
 
  == What justifies a -1 vote for this release? ==
  This vote is happening very late into the QA period compared with
  previous votes, so -1 votes should only occur for significant
  regressions from 1.0.2. Bugs already present in 1.1.X, minor
  regressions, or bugs related to new features will not block this
  release.
 
  == What default changes should I be aware of? ==
  1. The default value of spark.shuffle.blockTransferService has been
  changed to netty
  -- Old behavior can be restored by switching to nio
 
  2. The default value of spark.shuffle.manager has been changed to
 sort.
  -- Old behavior can be restored by setting spark.shuffle.manager to
  hash.
 
  == Other notes ==
  Because this vote is occurring over a weekend, I will likely extend
  the vote if this RC survives until the end of the vote period.
 
  - Patrick
 
 
 



Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-28 Thread Matei Zaharia
Hey Patrick, unfortunately you got some of the text here wrong, saying 1.1.0
instead of 1.2.0. Not sure it will matter, since there may well be another RC
after testing, but we should be careful.

Matei

 On Nov 28, 2014, at 9:16 PM, Patrick Wendell pwend...@gmail.com wrote:
 
 Please vote on releasing the following candidate as Apache Spark version 
 1.2.0!
 
 The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
 
 The release files, including signatures, digests, etc. can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1/
 
 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc
 
 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1048/
 
 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
 
 Please vote on releasing this package as Apache Spark 1.2.0!
 
 The vote is open until Tuesday, December 02, at 05:15 UTC and passes
 if a majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Spark 1.1.0
 [ ] -1 Do not release this package because ...
 
 To learn more about Apache Spark, please see
 http://spark.apache.org/
 
 == What justifies a -1 vote for this release? ==
 This vote is happening very late into the QA period compared with
 previous votes, so -1 votes should only occur for significant
 regressions from 1.0.2. Bugs already present in 1.1.X, minor
 regressions, or bugs related to new features will not block this
 release.
 
 == What default changes should I be aware of? ==
 1. The default value of spark.shuffle.blockTransferService has been
 changed to netty
 -- Old behavior can be restored by switching to nio
 
 2. The default value of spark.shuffle.manager has been changed to sort.
 -- Old behavior can be restored by setting spark.shuffle.manager to hash.
 
 == Other notes ==
 Because this vote is occurring over a weekend, I will likely extend
 the vote if this RC survives until the end of the vote period.
 
 - Patrick
 
 

