Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-30 Thread Patrick Wendell
Thanks to Nick Chammas and Cheng Lian who pointed out two issues with the release candidate. I'll cancel this in favor of RC3. On Fri, Aug 29, 2014 at 1:33 PM, Jeremy Freeman freeman.jer...@gmail.com wrote: +1. Validated several custom analysis pipelines on a private cluster in standalone mode.

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Sean Owen
+1 I tested the source and Hadoop 2.4 release. Checksums and signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't fail any more than usual. FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another project and have encountered no problems. I notice that the 1.1.0
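For readers who want to repeat the checksum part of this validation, here is a minimal sketch. It assumes a SHA-512 digest published as a hex string; the helper names are illustrative and not part of any Spark tooling, and signature (GPG) verification is a separate step not covered here.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large release tarballs fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def digest_matches(path, expected_hex, algorithm="sha512"):
    """Compare a file's digest with a published hex string.

    Published digests are often wrapped or upper-cased, so whitespace
    is removed and case is ignored before comparing.
    """
    normalized = "".join(expected_hex.split()).lower()
    return file_digest(path, algorithm) == normalized
```

A caller would pass the downloaded artifact's path and the digest string copied from the release's digest file.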

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Hey Sean, The reason there are no longer CDH-specific builds is that all newer versions of CDH and HDP work with builds for the upstream Hadoop projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and the Hadoop-without-Hive (also 2.4) build. For MapR - we can't officially post

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Sean Owen
(Copying my reply since I don't know if it goes to the mailing list) Great, thanks for explaining the reasoning. You're saying these aren't going into the final release? I think that moots any issue surrounding distributing them then. This is all I know of from the ASF:

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Matei Zaharia
Personally I'd actually consider putting CDH4 back if there are still users on it. It's always better to be inclusive, and the convenience of a one-click download is high. Do we have a sense on what % of CDH users still use CDH4? Matei On August 28, 2014 at 11:31:13 PM, Sean Owen

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Yeah, we can't/won't post MapR binaries on the ASF web space for the release. However, I have been linking to them (at their request) with a clear identifier that it is an incompatible license and a 3rd party build. The only vendor specific build property we provide is compatibility with

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Sean Owen
On Fri, Aug 29, 2014 at 7:42 AM, Patrick Wendell pwend...@gmail.com wrote: In terms of vendor support for this approach - In the early days Cloudera asked us to add CDH4 repository and more recently Pivotal and MapR also asked us to allow linking against their hadoop-client libraries. So we've

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Andrew Ash
FWIW we use CDH4 extensively and would very much appreciate having a prebuilt version of Spark for it. We're doing a CDH 4.4 to 4.7 upgrade across all the clusters now and have plans for a 5.x transition after that. On Aug 28, 2014 11:57 PM, Sean Owen so...@cloudera.com wrote: On Fri, Aug 29,

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Koert Kuipers
I suspect there are more CDH4 than CDH5 clusters. Most people plan to move to CDH5 within, say, 6 months. On Fri, Aug 29, 2014 at 3:57 AM, Andrew Ash and...@andrewash.com wrote: FWIW we use CDH4 extensively and would very much appreciate having a prebuilt version of Spark for it. We're doing

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Ye Xianjin
We just used CDH 4.7 for our production cluster. And I believe we won't use CDH 5 in the next year. Sent from my iPhone On August 29, 2014, at 14:39, Matei Zaharia matei.zaha...@gmail.com wrote: Personally I'd actually consider putting CDH4 back if there are still users on it. It's always

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Okay I'll plan to add cdh4 binary as well for the final release! --- sent from my phone On Aug 29, 2014 8:26 AM, Ye Xianjin advance...@gmail.com wrote: We just used CDH 4.7 for our production cluster. And I believe we won't use CDH 5 in the next year. Sent from my iPhone On August 29, 2014, at

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Nicholas Chammas
There were several formatting and typographical errors in the SQL docs that I've fixed in this PR https://github.com/apache/spark/pull/2201. Dunno if we want to roll that into the release. On Fri, Aug 29, 2014 at 12:17 PM, Patrick Wendell pwend...@gmail.com wrote: Okay I'll plan to add cdh4

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Hey Nicholas, Thanks for this, we can merge in doc changes outside of the actual release timeline, so we'll make sure to loop those changes in before we publish the final 1.1 docs. - Patrick On Fri, Aug 29, 2014 at 9:24 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: There were several

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Nicholas Chammas
[Let me know if I should be posting these comments in a different thread.] Should the default Spark version in spark-ec2 https://github.com/apache/spark/blob/e1535ad3c6f7400f2b7915ea91da9c60510557ba/ec2/spark_ec2.py#L86 be updated for this release? Nick On Fri, Aug 29, 2014 at 12:55 PM,

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Patrick Wendell
Oh darn - I missed this update. GRR, unfortunately I think this means I'll need to cut a new RC. Thanks for catching this Nick. On Fri, Aug 29, 2014 at 10:18 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: [Let me know if I should be posting these comments in a different thread.] Should

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Marcelo Vanzin
In our internal projects we use this bit of code in the maven pom to create a properties file with build information (sorry for the messy indentation). Then we have code that reads this property file somewhere and provides that info. This should make it easier to not have to change version numbers
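The pom fragment itself is not shown in this snippet, so the following only sketches the reading side of the idea: code that loads a build-generated properties file and exposes its values. The keys (`version`, `revision`) and the sample contents are hypothetical, and the parser is deliberately simplified: it handles plain `key=value` lines and comments, not the escapes, `:` separators, or line continuations that the full Java properties format allows.

```python
def parse_build_properties(text):
    """Parse simple key=value lines from a properties-style string."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or '!' prefixes).
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # Not a key=value line; ignored in this sketch.
        props[key.strip()] = value.strip()
    return props


# Hypothetical contents that a build step might write out.
SAMPLE = """\
# generated at build time (illustrative only)
version=1.1.0
revision=711aebb3
"""

info = parse_build_properties(SAMPLE)
```

With a file like this generated on every build, tools could read the version from it at runtime rather than carrying a hard-coded number that has to be edited for each release.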

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Cheng Lian
Just noticed one thing: although --with-hive is deprecated by -Phive, make-distribution.sh still relies on $SPARK_HIVE (which was controlled by --with-hive) to determine whether to include datanucleus jar files. This means we have to do something like SPARK_HIVE=true ./make-distribution.sh ... to
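To make the mismatch concrete, here is a small model of one way the decision could honor both the legacy variable and the Maven profile. This is a hypothetical sketch, not the logic in make-distribution.sh (which is a shell script); the function name and the argument parsing are assumptions made for illustration.

```python
def should_bundle_datanucleus(env, maven_args):
    """Decide whether the datanucleus jars would be included.

    env        -- mapping of environment variables (e.g. os.environ)
    maven_args -- list of arguments passed through to Maven
    """
    # Legacy path: the variable that --with-hive used to control.
    legacy_flag = env.get("SPARK_HIVE", "").lower() == "true"
    # New path: the hive profile, alone or in a comma-separated -P list.
    hive_profile = any(
        arg.startswith("-P") and "hive" in arg[2:].split(",")
        for arg in maven_args
    )
    return legacy_flag or hive_profile
```

Under this model, passing `-Phive` alone is enough, and setting `SPARK_HIVE=true` by hand keeps working for existing scripts.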

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-29 Thread Jeremy Freeman
+1. Validated several custom analysis pipelines on a private cluster in standalone mode. Tested new PySpark support for arbitrary Hadoop input formats, works great! -- Jeremy -- View this message in context:

[VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc2 (commit 711aebb3): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=711aebb329ca28046396af1e34395a0df92b5327 The release files, including signatures, digests, etc.

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
I'll kick off the vote with a +1. On Thu, Aug 28, 2014 at 7:14 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc2 (commit 711aebb3):

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Burak Yavuz
+1. Tested MLlib algorithms on Amazon EC2, algorithms show speed-ups between 1.5-5x compared to the 1.0.2 release. - Original Message - From: Patrick Wendell pwend...@gmail.com To: dev@spark.apache.org Sent: Thursday, August 28, 2014 8:32:11 PM Subject: Re: [VOTE] Release Apache Spark

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Timothy Chen
. - Original Message - From: Patrick Wendell pwend...@gmail.com To: dev@spark.apache.org Sent: Thursday, August 28, 2014 8:32:11 PM Subject: Re: [VOTE] Release Apache Spark 1.1.0 (RC2) I'll kick off the vote with a +1. On Thu, Aug 28, 2014 at 7:14 PM, Patrick Wendell pwend

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Cheng Lian
, 2014 8:32:11 PM Subject: Re: [VOTE] Release Apache Spark 1.1.0 (RC2) I'll kick off the vote with a +1. On Thu, Aug 28, 2014 at 7:14 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted