Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-26 Thread Stephen Hellberg
Yeah, I thought the vote was closed... but I couldn't think of a better thread to remark upon! That's a useful comment on Derby's role - thanks. Certainly, we'd just attempted a build-and-test run after raising the Derby level to the current 10.12.1.1, and hadn't observed any issues... a PR

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-26 Thread Sean Owen
The release vote has already closed and passed. Derby is only used in tests AFAIK, so I don't think this is even critical, let alone a blocker. Updating is fine though; open a PR. On Tue, Jul 26, 2016 at 3:37 PM, Stephen Hellberg wrote: > -1 Sorry, I've just noted that the

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-26 Thread Stephen Hellberg
-1 Sorry, I've just noted that the RC5 proposal includes shipping Derby @ 10.11.1.1, which is vulnerable to CVE-2015-1832. It would be ideal if we could instead ship 10.12.1.1 real soon.

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-25 Thread Luciano Resende
When are we planning to push the release Maven artifacts? We are waiting for this in order to push an official Apache Bahir release supporting Spark 2.0. On Sat, Jul 23, 2016 at 7:05 AM, Reynold Xin wrote: > The vote has passed with the following +1 votes and no -1 votes.

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-23 Thread Ewan Leith
OK cool, I didn't vote as I've done no real testing myself and I think the window had already closed anyway. I'm happy to wait for 2.0.1 for our systems. Thanks, Ewan On 23 Jul 2016 07:07, Reynold Xin wrote: Ewan not sure if you wanted to explicitly -1 so I didn’t include

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-23 Thread Reynold Xin
Ewan not sure if you wanted to explicitly -1 so I didn’t include you in that. I will document this as a known issue in the release notes. We have other bugs that we have fixed since RC5, and we can fix those together in 2.0.1. On July 22, 2016 at 10:24:32 PM, Ewan Leith

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-23 Thread Reynold Xin
The vote has passed with the following +1 votes and no -1 votes. I will work on packaging the new release next week. +1 Reynold Xin* Sean Owen* Shivaram Venkataraman* Jonathan Kelly Joseph E. Gonzalez* Krishna Sankar Dongjoon Hyun Ricardo Almeida Joseph Bradley* Matei Zaharia* Luciano Resende

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Ewan Leith
I think this new issue in JIRA blocks the release unfortunately? https://issues.apache.org/jira/browse/SPARK-16664 - Persist call on data frames with more than 200 columns is wiping out the data Otherwise there'll need to be 2.0.1 pretty much right after? Thanks, Ewan On 23 Jul 2016 03:46,

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Xiao Li
+1 2016-07-22 19:32 GMT-07:00 Kousuke Saruta: > +1 (non-binding) > > Tested on my cluster with three slave nodes. > > On 2016/07/23 10:25, Suresh Thalamati wrote: > > +1 (non-binding) > > Tested data source API and JDBC data sources. > > > On Jul 19, 2016, at 7:35

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Kousuke Saruta
+1 (non-binding) Tested on my cluster with three slave nodes. On 2016/07/23 10:25, Suresh Thalamati wrote: +1 (non-binding) Tested data source API and JDBC data sources. On Jul 19, 2016, at 7:35 PM, Reynold Xin wrote: Please vote on

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Suresh Thalamati
+1 (non-binding) Tested data source API and JDBC data sources. > On Jul 19, 2016, at 7:35 PM, Reynold Xin wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.0.0. The vote is open until Friday, July 22, 2016 at 20:00 PDT and passes

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Felix Cheung
+1 Tested on Ubuntu, ran a bunch of SparkR tests, and found a broken link in the docs, but not a blocker. From: Michael Armbrust <mich...@databricks.com> Sent: Friday, July 22, 2016 3:18 PM Subject: Re: [VOTE] Release Apache Spar

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Michael Armbrust
+1 On Fri, Jul 22, 2016 at 2:42 PM, Holden Karau wrote: > +1 (non-binding) > > Built locally on Ubuntu 14.04, basic pyspark sanity checking & tested with > a simple structured streaming project (spark-structured-streaming-ml) & > spark-testing-base &

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Holden Karau
+1 (non-binding) Built locally on Ubuntu 14.04, basic pyspark sanity checking & tested with a simple structured streaming project (spark-structured-streaming-ml) & spark-testing-base & high-performance-spark-examples (minor changes required from preview version but seem intentional & jetty

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Luciano Resende
+1 (non-binding) Found a minor issue when trying to run some of the Docker tests, but nothing blocking the release. Will create a JIRA for that. On Tue, Jul 19, 2016 at 7:35 PM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Matei Zaharia
+1 Tested on Mac. Matei > On Jul 22, 2016, at 11:18 AM, Joseph Bradley wrote: > > +1 > > Mainly tested ML/Graph/R. Perf tests from Tim Hunter showed minor speedups > from 1.6 for common ML algorithms. > > On Thu, Jul 21, 2016 at 9:41 AM, Ricardo Almeida >

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-22 Thread Joseph Bradley
+1 Mainly tested ML/Graph/R. Perf tests from Tim Hunter showed minor speedups from 1.6 for common ML algorithms. On Thu, Jul 21, 2016 at 9:41 AM, Ricardo Almeida <ricardo.alme...@actnowib.com> wrote: > +1 (non-binding) > > Tested PySpark Core, DataFrame/SQL, MLlib and Streaming on a

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Reynold Xin
+1 On Wednesday, July 20, 2016, Krishna Sankar wrote: > +1 (non-binding, of course) > > 1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 24:07 min > mvn clean package -Pyarn -Phadoop-2.7 -DskipTests > 2. Tested pyspark, mllib (iPython 4.0) > 2.0 Spark version is

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Krishna Sankar
+1 (non-binding, of course)
1. Compiled OS X 10.11.5 (El Capitan) OK Total time: 24:07 min
   mvn clean package -Pyarn -Phadoop-2.7 -DskipTests
2. Tested pyspark, mllib (iPython 4.0)
2.0 Spark version is 2.0.0
2.1. statistics (min,max,mean,Pearson,Spearman) OK
2.2. Linear/Ridge/Lasso Regression

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Joseph Gonzalez
+1

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Michael Allman
I've run some tests with real and synthetic Parquet data with nested columns, with and without the Hive metastore, on our Spark 1.5, 1.6 and 2.0 versions. I haven't seen any performance surprises, except that Spark 2.0 now does schema inference across all files in a

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Maciej Bryński
@Michael, I answered in JIRA and can repeat it here. I think that my problem is unrelated to Hive, because I'm using the read.parquet method. I also attached some VisualVM snapshots to SPARK-16321 (I think I should merge both issues). Code profiling suggests a bottleneck when reading the Parquet file. I

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Marcin Tustin
I refer to Maciej Bryński's (mac...@brynski.pl) emails of 29 and 30 June 2016 to this list. He said that his benchmarking suggested that Spark 2.0 was slower than 1.6. I'm wondering whether that was ever investigated and, if so, whether the speed is back up. On Wed, Jul 20, 2016 at 12:18 PM,

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Michael Allman
Marcin, I'm not sure what you're referring to. Can you be more specific? Cheers, Michael > On Jul 20, 2016, at 9:10 AM, Marcin Tustin wrote: > > Whatever happened with the query regarding benchmarks? Is that resolved? > > On Tue, Jul 19, 2016 at 10:35 PM, Reynold Xin

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Marcin Tustin
Whatever happened with the query regarding benchmarks? Is that resolved? On Tue, Jul 19, 2016 at 10:35 PM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.0.0. The vote is open until Friday, July 22, 2016 at 20:00 PDT and

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Shivaram Venkataraman
+1 SHA and MD5 sums match for all binaries. Docs look fine this time around. Built and ran `dev/run-tests` with Java 7 on a Linux machine. No blocker bugs on JIRA, and the only critical bug with a target of 2.0.0 is SPARK-16633, which doesn't look like a release blocker. I also checked issues which
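
For readers who want to repeat the checksum comparison mentioned above, here is a minimal Python sketch. The function names, the default algorithm, and the assumption that the published digest is a hex string (possibly uppercase or split by whitespace) are illustrative only; they are not taken from this thread or from Spark's release tooling.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large release binaries fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published, algorithm="sha512"):
    """Compare a file's digest with a published hex digest.

    Assumption: `published` holds only the hex digits, possibly in
    uppercase or separated by whitespace, so both are normalized away.
    """
    normalized = "".join(published.split()).lower()
    return file_digest(path, algorithm) == normalized
```

A reviewer would call `matches_published` once per downloaded binary and per algorithm (for example `"md5"` and a SHA variant), passing in the digest text copied from the corresponding checksum file.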

[VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-19 Thread Reynold Xin
Please vote on releasing the following candidate as Apache Spark version 2.0.0. The vote is open until Friday, July 22, 2016 at 20:00 PDT and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.0.0 [ ] -1 Do not release this package because ...
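
One reading of the passing rule stated above (at least 3 +1 PMC votes, and more +1 than -1 PMC votes) can be sketched in Python as follows. The function name and parameters are hypothetical, and the interpretation of "majority" is an assumption; non-PMC votes are treated as advisory and are not counted.

```python
def vote_passes(pmc_plus_ones, pmc_minus_ones, minimum_plus_ones=3):
    """Return True if a release vote passes under the assumed rule.

    Assumed rule: at least `minimum_plus_ones` +1 votes from PMC members,
    and strictly more PMC +1 votes than PMC -1 votes.
    """
    has_quorum = pmc_plus_ones >= minimum_plus_ones
    has_majority = pmc_plus_ones > pmc_minus_ones
    return has_quorum and has_majority
```

Under this reading, `vote_passes(3, 0)` holds while `vote_passes(2, 0)` does not, since two +1 votes fall short of the minimum of three.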