Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Hi Yin,

I'm using the spark-hive dependency, and my app's tests work with Spark 1.3.1. It seems to be something with Hive and sbt. Running the next statement from spark-shell works, but from the sbt console in rc3 I get the following error:

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
15/05/29 16:31:06 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@177ac9f4

scala> val data = sqlContext.read.parquet("caches/-1525448137")
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java.lang.IllegalArgumentException: Unable to locate hive jars to connect to metastore using classloader scala.tools.nsc.interpreter.IMain$TranslatingClassLoader. Please set spark.sql.hive.metastore.jars
  at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:206)
  at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
  at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:367)
  at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:367)
  at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:366)
  at org.apache.spark.sql.hive.HiveContext$$anon$1.<init>(HiveContext.scala:379)
  at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:379)
  at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:378)
  at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:901)
  at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:134)
  at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
  at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:419)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:264)

Thanks,
Peter Rudenko

On 2015-05-29 07:08, Yin Huai wrote:

Justin,

If you are creating multiple HiveContexts in tests, you need to assign a temporary metastore location for every HiveContext (like what we do here: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L527-L543). Otherwise, they all try to connect to the metastore in the current dir (look at metastore_db).

Peter, do you also have the same use case as Justin (creating multiple HiveContexts in tests)? Can you explain what you meant by "all tests"? I am probably missing some context here.

Thanks,
Yin

On Thu, May 28, 2015 at 11:28 AM, Peter Rudenko petro.rude...@gmail.com wrote:

Also have the same issue - all tests fail because of the HiveContext / Derby lock:

Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: --
[info] java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@8066e0e, see the next exception for details.

Also, is there a build for Hadoop 2.6?
Don't see it here: http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc2-bin/

Thanks,
Peter Rudenko

On 2015-05-22 22:56, Justin Uang wrote:

I'm working on one of the Palantir teams using Spark, and here is our feedback. We have encountered three issues when upgrading to Spark 1.4.0. I'm not sure they qualify as a -1, as they come from using non-public APIs and multiple Spark contexts for the purposes of testing, but I do want to bring them up for awareness =)

1. Our UDT was serializing to a StringType, but now strings are represented internally as UTF8String, so we had to change our UDT to use UTF8String.apply() and UTF8String.toString() to convert back to String.

2. createDataFrame when using UDTs used to accept things in the serialized Catalyst form. Now they're supposed to be in the UDT Java class form (I think this change would've affected us in 1.3.1 already, since we were on 1.3.0).

3. Derby database lifecycle management issue with HiveContext. We have been using a SparkContextResource JUnit Rule that we wrote; it sets up and then tears down a SparkContext and HiveContext between unit test runs within the same process (possibly the same thread as well). Multiple contexts are not being used at once.
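A minimal sketch (not from the thread) of the workaround that the "Unable to locate hive jars" exception message points at, assuming a Spark 1.4 HiveContext created from an sbt console; "maven" and "builtin" are the documented values of spark.sql.hive.metastore.jars, and the master, app name, and parquet path here are illustrative placeholders:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hedged sketch: tell the isolated Hive client loader where to get its jars.
// "maven" downloads the Hive client jars; "builtin" reuses the ones already on
// the application classpath. Must be set before the metastore is first touched,
// since metadataHive is initialized lazily.
val sc = new SparkContext(
  new SparkConf().setMaster("local[2]").setAppName("metastore-jars-workaround"))
val sqlContext = new HiveContext(sc)
sqlContext.setConf("spark.sql.hive.metastore.jars", "maven")
val data = sqlContext.read.parquet("caches/-1525448137")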
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Also have the same issue - all tests fail because of the HiveContext / Derby lock:

Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: --
[info] java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@8066e0e, see the next exception for details.

Also, is there a build for Hadoop 2.6? Don't see it here: http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc2-bin/

Thanks,
Peter Rudenko

On 2015-05-22 22:56, Justin Uang wrote:

I'm working on one of the Palantir teams using Spark, and here is our feedback. We have encountered three issues when upgrading to Spark 1.4.0. I'm not sure they qualify as a -1, as they come from using non-public APIs and multiple Spark contexts for the purposes of testing, but I do want to bring them up for awareness =)

1. Our UDT was serializing to a StringType, but now strings are represented internally as UTF8String, so we had to change our UDT to use UTF8String.apply() and UTF8String.toString() to convert back to String.

2. createDataFrame when using UDTs used to accept things in the serialized Catalyst form. Now they're supposed to be in the UDT Java class form (I think this change would've affected us in 1.3.1 already, since we were on 1.3.0).

3. Derby database lifecycle management issue with HiveContext. We have been using a SparkContextResource JUnit Rule that we wrote; it sets up and then tears down a SparkContext and HiveContext between unit test runs within the same process (possibly the same thread as well). Multiple contexts are not being used at once. It used to work in 1.3.0, but now when we try to create the HiveContext for the second unit test, it complains with the following exception. I have a feeling it might have something to do with the Hive object being thread-local, and us not explicitly closing the HiveContext and everything it holds. The full stack trace is here: https://gist.github.com/justinuang/0403d49cdeedf91727cd

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
  at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)

On Wed, May 20, 2015 at 10:35 AM Imran Rashid iras...@cloudera.com wrote:

-1

Discovered I accidentally removed the master/worker JSON endpoints; will restore: https://issues.apache.org/jira/browse/SPARK-7760

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Justin,

If you are creating multiple HiveContexts in tests, you need to assign a temporary metastore location for every HiveContext (like what we do here: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L527-L543). Otherwise, they all try to connect to the metastore in the current dir (look at metastore_db).

Peter, do you also have the same use case as Justin (creating multiple HiveContexts in tests)? Can you explain what you meant by "all tests"? I am probably missing some context here.

Thanks,
Yin

On Thu, May 28, 2015 at 11:28 AM, Peter Rudenko petro.rude...@gmail.com wrote:

Also have the same issue - all tests fail because of the HiveContext / Derby lock:

Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: --
[info] java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@8066e0e, see the next exception for details.

Also, is there a build for Hadoop 2.6? Don't see it here: http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc2-bin/

Thanks,
Peter Rudenko

On 2015-05-22 22:56, Justin Uang wrote:

I'm working on one of the Palantir teams using Spark, and here is our feedback. We have encountered three issues when upgrading to Spark 1.4.0. I'm not sure they qualify as a -1, as they come from using non-public APIs and multiple Spark contexts for the purposes of testing, but I do want to bring them up for awareness =)

1. Our UDT was serializing to a StringType, but now strings are represented internally as UTF8String, so we had to change our UDT to use UTF8String.apply() and UTF8String.toString() to convert back to String.

2. createDataFrame when using UDTs used to accept things in the serialized Catalyst form. Now they're supposed to be in the UDT Java class form (I think this change would've affected us in 1.3.1 already, since we were on 1.3.0).

3. Derby database lifecycle management issue with HiveContext. We have been using a SparkContextResource JUnit Rule that we wrote; it sets up and then tears down a SparkContext and HiveContext between unit test runs within the same process (possibly the same thread as well). Multiple contexts are not being used at once. It used to work in 1.3.0, but now when we try to create the HiveContext for the second unit test, it complains with the following exception. I have a feeling it might have something to do with the Hive object being thread-local, and us not explicitly closing the HiveContext and everything it holds. The full stack trace is here: https://gist.github.com/justinuang/0403d49cdeedf91727cd

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
  at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)

On Wed, May 20, 2015 at 10:35 AM Imran Rashid iras...@cloudera.com wrote:

-1

Discovered I accidentally removed the master/worker JSON endpoints; will restore: https://issues.apache.org/jira/browse/SPARK-7760

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.
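To make Yin's suggestion concrete, a minimal sketch (not from the thread) modeled on the newTemporaryConfiguration() helper linked above; freshHiveContext is a hypothetical test helper, and it assumes the properties are set before the context first touches the metastore:

import java.nio.file.Files
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Hedged sketch: give each test's HiveContext its own Derby metastore and
// warehouse directory, so two contexts in one JVM don't race on ./metastore_db.
def freshHiveContext(sc: SparkContext): HiveContext = {
  val metastore = Files.createTempDirectory("metastore").toFile
  val warehouse = Files.createTempDirectory("warehouse").toFile
  // HiveConf picks up matching system properties when it is constructed,
  // so these must be set before the HiveContext is created.
  System.setProperty("javax.jdo.option.ConnectionURL",
    s"jdbc:derby:;databaseName=${metastore.getAbsolutePath}/metastore_db;create=true")
  System.setProperty("hive.metastore.warehouse.dir", warehouse.getAbsolutePath)
  new HiveContext(sc)
}

Each test would then call freshHiveContext(sc) in place of new HiveContext(sc).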
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Hey jameszhouyi,

Since SPARK-7119 is not a regression from earlier versions, we won't hold the release for it. However, please comment on the JIRA if it is affecting you... it will help us prioritize the bug.

- Patrick

On Fri, May 22, 2015 at 8:41 PM, jameszhouyi yiaz...@gmail.com wrote:

We came across a Spark SQL issue (https://issues.apache.org/jira/browse/SPARK-7119) that causes queries to fail. I am not sure whether this warrants a -1 on RC1.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
I'm working on one of the Palantir teams using Spark, and here is our feedback. We have encountered three issues when upgrading to Spark 1.4.0. I'm not sure they qualify as a -1, as they come from using non-public APIs and multiple Spark contexts for the purposes of testing, but I do want to bring them up for awareness =)

1. Our UDT was serializing to a StringType, but now strings are represented internally as UTF8String, so we had to change our UDT to use UTF8String.apply() and UTF8String.toString() to convert back to String.

2. createDataFrame when using UDTs used to accept things in the serialized Catalyst form. Now they're supposed to be in the UDT Java class form (I think this change would've affected us in 1.3.1 already, since we were on 1.3.0).

3. Derby database lifecycle management issue with HiveContext. We have been using a SparkContextResource JUnit Rule that we wrote; it sets up and then tears down a SparkContext and HiveContext between unit test runs within the same process (possibly the same thread as well). Multiple contexts are not being used at once. It used to work in 1.3.0, but now when we try to create the HiveContext for the second unit test, it complains with the following exception. I have a feeling it might have something to do with the Hive object being thread-local, and us not explicitly closing the HiveContext and everything it holds. The full stack trace is here: https://gist.github.com/justinuang/0403d49cdeedf91727cd

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
  at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)

On Wed, May 20, 2015 at 10:35 AM Imran Rashid iras...@cloudera.com wrote:

-1

Discovered I accidentally removed the master/worker JSON endpoints; will restore: https://issues.apache.org/jira/browse/SPARK-7760

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
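A minimal sketch of the change described in point 1 above; the Point and PointUDT classes are hypothetical, and this assumes Spark 1.4, where UTF8String lives in org.apache.spark.sql.types:

import org.apache.spark.sql.types._

// Hedged sketch: a UDT whose sqlType is StringType now hands Catalyst a
// UTF8String on serialize and accepts one on deserialize (in 1.3 these
// were plain java.lang.String values).
class Point(val label: String) extends Serializable

class PointUDT extends UserDefinedType[Point] {
  override def sqlType: DataType = StringType
  override def serialize(obj: Any): Any = obj match {
    case p: Point => UTF8String(p.label)        // 1.3: p.label
  }
  override def deserialize(datum: Any): Point = datum match {
    case s: UTF8String => new Point(s.toString) // 1.3: case s: String
  }
  override def userClass: Class[Point] = classOf[Point]
}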
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Thanks for catching this. I'll check with Patrick to see why the R API docs are not getting included.

On Fri, May 22, 2015 at 2:44 PM, Andrew Psaltis psaltis.and...@gmail.com wrote:

All,

Should all the docs work from http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/ ? If so, the R API docs 404.

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
All,

Should all the docs work from http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/ ? If so, the R API docs 404.

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Thanks for the feedback. As you stated, UDTs are explicitly not a public API, as we knew we were going to be making breaking changes to them. We hope to stabilize / open them up in future releases.

Regarding the Hive issue, have you tried using TestHive instead? This is what we use for testing, and it takes care of creating temporary directories for all storage. It also has a reset() function that you can call in between tests. If this doesn't work for you, maybe open a JIRA and we can discuss more there.

On Fri, May 22, 2015 at 12:56 PM, Justin Uang justin.u...@gmail.com wrote:

I'm working on one of the Palantir teams using Spark, and here is our feedback. We have encountered three issues when upgrading to Spark 1.4.0. I'm not sure they qualify as a -1, as they come from using non-public APIs and multiple Spark contexts for the purposes of testing, but I do want to bring them up for awareness =)

1. Our UDT was serializing to a StringType, but now strings are represented internally as UTF8String, so we had to change our UDT to use UTF8String.apply() and UTF8String.toString() to convert back to String.

2. createDataFrame when using UDTs used to accept things in the serialized Catalyst form. Now they're supposed to be in the UDT Java class form (I think this change would've affected us in 1.3.1 already, since we were on 1.3.0).

3. Derby database lifecycle management issue with HiveContext. We have been using a SparkContextResource JUnit Rule that we wrote; it sets up and then tears down a SparkContext and HiveContext between unit test runs within the same process (possibly the same thread as well). Multiple contexts are not being used at once. It used to work in 1.3.0, but now when we try to create the HiveContext for the second unit test, it complains with the following exception. I have a feeling it might have something to do with the Hive object being thread-local, and us not explicitly closing the HiveContext and everything it holds. The full stack trace is here: https://gist.github.com/justinuang/0403d49cdeedf91727cd

Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@5dea2446, see the next exception for details.
  at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)

On Wed, May 20, 2015 at 10:35 AM Imran Rashid iras...@cloudera.com wrote:

-1

Discovered I accidentally removed the master/worker JSON endpoints; will restore: https://issues.apache.org/jira/browse/SPARK-7760

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
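A minimal sketch of the TestHive pattern suggested above; MyHiveSuite is a hypothetical ScalaTest suite, and the exact artifact TestHive ships in may vary by Spark version:

import org.apache.spark.sql.hive.test.TestHive
import org.scalatest.{BeforeAndAfterEach, FunSuite}

// Hedged sketch: TestHive manages its own temporary metastore and warehouse
// directories, and reset() clears cached state between tests, so no per-test
// Derby bookkeeping is needed.
class MyHiveSuite extends FunSuite with BeforeAndAfterEach {
  import TestHive.implicits._

  override def afterEach(): Unit = TestHive.reset()

  test("counts rows registered through a temp table") {
    val df = TestHive.sparkContext.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "name")
    df.registerTempTable("people")
    assert(TestHive.sql("SELECT count(*) FROM people").collect().head.getLong(0) === 2)
  }
}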
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Thanks Andrew, the doc issue should be fixed in RC2 (if not, please chime in!). R was missing in the build environment.

- Patrick

On Fri, May 22, 2015 at 3:33 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote:

Thanks for catching this. I'll check with Patrick to see why the R API docs are not getting included.

On Fri, May 22, 2015 at 2:44 PM, Andrew Psaltis psaltis.and...@gmail.com wrote:

All,

Should all the docs work from http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/ ? If so, the R API docs 404.

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
We came across a Spark SQL issue (https://issues.apache.org/jira/browse/SPARK-7119) that causes queries to fail. I am not sure whether this warrants a -1 on RC1.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Signature, hashes, LICENSE/NOTICE, and source tarball look OK. I built for Hadoop 2.6 (-Pyarn -Phive -Phadoop-2.6) on Ubuntu from source, and tests pass. The release looks OK, except that I'd like to resolve the Blockers before giving a +1.

I'm seeing some test failures and wanted to cross-check with others. They're all in Hive. Some, I think, are due to Java 8 differences and are just test issues: they expect an exact output from a query plan, and some HashSet ordering differences make it trivially different. If so, I've seen this in the past and we could ignore it for now, but it would be good to get a second set of eyes. The trace is big, so it's at the end.

When rerunning with Java 7 I get a different error, due to Hive version support:

- success sanity check *** FAILED ***
  java.lang.RuntimeException: [download failed: org.jboss.netty#netty;3.2.2.Final!netty.jar(bundle), download failed: commons-net#commons-net;3.1!commons-net.jar]
  at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:972)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anonfun$3.apply(IsolatedClientLoader.scala:62)
  ...

Hive / possible Java 8 test issue:

- windowing.q -- 20. testSTATs *** FAILED ***
  Results do not match for windowing.q -- 20. testSTATs:

== Parsed Logical Plan ==
'WithWindowDefinition Map(w1 -> WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
 'Project ['p_mfgr,'p_name,'p_size,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction stddev UnresolvedAttribute [p_retailprice] AS sdev#159481,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction stddev_pop UnresolvedAttribute [p_retailprice] AS sdev_pop#159482,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction collect_set UnresolvedAttribute [p_size] AS uniq_size#159483,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction variance UnresolvedAttribute [p_retailprice] AS var#159484,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction corr UnresolvedAttribute [p_size] UnresolvedAttribute [p_retailprice] AS cor#159485,UnresolvedWindowExpression WindowSpecReference(w1) UnresolvedWindowFunction covar_pop UnresolvedAttribute [p_size] UnresolvedAttribute [p_retailprice] AS covarp#159486]
  'UnresolvedRelation [part], None

== Analyzed Logical Plan ==
p_mfgr: string, p_name: string, p_size: int, sdev: double, sdev_pop: double, uniq_size: array<int>, var: double, cor: double, covarp: double
Project [p_mfgr#159489,p_name#159488,p_size#159492,sdev#159481,sdev_pop#159482,uniq_size#159483,var#159484,cor#159485,covarp#159486]
 Window [p_mfgr#159489,p_name#159488,p_size#159492,p_retailprice#159494], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStd(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS sdev#159481,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStd(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS sdev_pop#159482,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCollectSet(p_size#159492) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS uniq_size#159483,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFVariance(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS var#159484,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCorrelation(p_size#159492,p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS cor#159485,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCovariance(p_size#159492,p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS covarp#159486], WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
  Project [p_mfgr#159489,p_name#159488,p_size#159492,p_retailprice#159494]
   MetastoreRelation default, part, None

== Optimized Logical Plan ==
Project [p_mfgr#159489,p_name#159488,p_size#159492,sdev#159481,sdev_pop#159482,uniq_size#159483,var#159484,cor#159485,covarp#159486]
 Window [p_mfgr#159489,p_name#159488,p_size#159492,p_retailprice#159494], [HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStd(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS sdev#159481,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFStd(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS sdev_pop#159482,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCollectSet(p_size#159492) WindowSpecDefinition ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING AS uniq_size#159483,HiveWindowFunction#org.apache.hadoop.hive.ql.udf.generic.GenericUDAFVariance(p_retailprice#159494) WindowSpecDefinition ROWS BETWEEN 2
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
-1

Discovered I accidentally removed the master/worker JSON endpoints; will restore: https://issues.apache.org/jira/browse/SPARK-7760

On Tue, May 19, 2015 at 11:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Quick tests from my side - looks OK. The results are the same or very similar to 1.3.1. Will add DataFrames et al in future tests.

+1 (non-binding, of course)

1. Compiled on OSX 10.10 (Yosemite) OK. Total time: 17:42 min.
   mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests
2. Tested pyspark and MLlib - running as well as comparing results with 1.3.1:
   2.1. statistics (min, max, mean, Pearson, Spearman) OK
   2.2. Linear/Ridge/Lasso Regression OK
   2.3. Decision Tree, Naive Bayes OK
   2.4. KMeans OK; Center and Scale OK
   2.5. RDD operations OK (State of the Union texts - MapReduce, Filter, sortByKey word count)
   2.6. Recommendation (MovieLens medium dataset, ~1M ratings) OK; model evaluation/optimization (rank, numIter, lambda) with itertools OK

Cheers
k/

On Tue, May 19, 2015 at 9:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
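For anyone who wants to replay a subset of these checks on the RC from Scala rather than pyspark, a toy-data sketch (not from the thread; the app name and numbers are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.stat.Statistics

// Hedged sketch: a quick MLlib smoke test along the lines of item 2.1 above,
// checking column statistics and both correlation methods on toy data.
val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("rc-smoke"))

val observations = sc.parallelize(Seq(
  Vectors.dense(1.0, 10.0),
  Vectors.dense(2.0, 20.0),
  Vectors.dense(3.0, 30.0)))
val summary = Statistics.colStats(observations)
println(s"mean=${summary.mean} min=${summary.min} max=${summary.max}")

val x = sc.parallelize(Seq(1.0, 2.0, 3.0))
val y = sc.parallelize(Seq(1.1, 2.2, 3.3))
println(s"pearson=${Statistics.corr(x, y, "pearson")} spearman=${Statistics.corr(x, y, "spearman")}")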
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Hi all,

I've created another release repository where the release is identified with the version 1.4.0-rc1:

https://repository.apache.org/content/repositories/orgapachespark-1093/

On Tue, May 19, 2015 at 5:36 PM, Krishna Sankar ksanka...@gmail.com wrote:

Quick tests from my side - looks OK. The results are the same or very similar to 1.3.1. Will add DataFrames et al in future tests.

+1 (non-binding, of course)

1. Compiled on OSX 10.10 (Yosemite) OK. Total time: 17:42 min.
   mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests
2. Tested pyspark and MLlib - running as well as comparing results with 1.3.1:
   2.1. statistics (min, max, mean, Pearson, Spearman) OK
   2.2. Linear/Ridge/Lasso Regression OK
   2.3. Decision Tree, Naive Bayes OK
   2.4. KMeans OK; Center and Scale OK
   2.5. RDD operations OK (State of the Union texts - MapReduce, Filter, sortByKey word count)
   2.6. Recommendation (MovieLens medium dataset, ~1M ratings) OK; model evaluation/optimization (rank, numIter, lambda) with itertools OK

Cheers
k/

On Tue, May 19, 2015 at 9:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Punya,

Let me see if I can publish these under rc1 as well. In the future this will all be automated, but currently it's a somewhat manual task.

- Patrick

On Tue, May 19, 2015 at 9:32 AM, Punyashloka Biswal punya.bis...@gmail.com wrote:

When publishing future RCs to the staging repository, would it be possible to use a version number that includes the "rc1" designation? In the current setup, when I run a build against the artifacts at https://repository.apache.org/content/repositories/orgapachespark-1092/org/apache/spark/spark-core_2.10/1.4.0/, my local Maven cache will get polluted with things that claim to be 1.4.0 but aren't. It would be preferable for the version number to be 1.4.0-rc1 instead.

Thanks!
Punya

On Tue, May 19, 2015 at 12:20 PM Sean Owen so...@cloudera.com wrote:

Before I vote, I wanted to point out there are still 9 Blockers for 1.4.0. I'd like to use this status to really mean "must happen before the release". Many of these may be already fixed or aren't really blockers -- they can just be updated accordingly. I bet at least one will require further work if it's really meant for 1.4, so all this means is there is likely to be another RC. We should still kick the tires on RC1. (I also assume we should be extra conservative about what is merged into 1.4 at this point.)

SPARK-6784 SQL Clean up all the inbound/outbound conversions for DateType (Adrian Wang)
SPARK-6811 SparkR Building binary R packages for SparkR (Shivaram Venkataraman)
SPARK-6941 SQL Provide a better error message to explain that tables created from RDDs are immutable
SPARK-7158 SQL collect and take return different results
SPARK-7478 SQL Add a SQLContext.getOrCreate to maintain a singleton instance of SQLContext (Tathagata Das)
SPARK-7616 SQL Overwriting a partitioned parquet table corrupt data (Cheng Lian)
SPARK-7654 SQL DataFrameReader and DataFrameWriter for input/output API (Reynold Xin)
SPARK-7662 SQL Exception of multi-attribute generator analysis in projection
SPARK-7713 SQL Use shared broadcast hadoop conf for partitioned table scan (Yin Huai)

On Tue, May 19, 2015 at 5:10 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
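For anyone testing against the staging repositories in the meantime, a minimal sbt sketch (not from the thread); repository 1092 publishes the artifacts plainly as 1.4.0, which is exactly the cache-pollution issue described above, while repository 1093 (announced later in this thread) republishes them as 1.4.0-rc1:

// Hedged sketch, build.sbt: resolve the RC from the staging repository that
// carries the unambiguous 1.4.0-rc1 version number.
resolvers += "spark-rc-staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1093/"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0-rc1"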
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Thanks! I realize that manipulating the published version in the pom is a bit inconvenient, but it's really useful to have clear version identifiers when we're juggling different versions and testing them out. For example, this will come in handy when we compare 1.4.0-rc1 and 1.4.0-rc2 in a couple of weeks :)

Punya

On Tue, May 19, 2015 at 12:39 PM Patrick Wendell pwend...@gmail.com wrote:

Punya,

Let me see if I can publish these under rc1 as well. In the future this will all be automated, but currently it's a somewhat manual task.

- Patrick

On Tue, May 19, 2015 at 9:32 AM, Punyashloka Biswal punya.bis...@gmail.com wrote:

When publishing future RCs to the staging repository, would it be possible to use a version number that includes the "rc1" designation? In the current setup, when I run a build against the artifacts at https://repository.apache.org/content/repositories/orgapachespark-1092/org/apache/spark/spark-core_2.10/1.4.0/, my local Maven cache will get polluted with things that claim to be 1.4.0 but aren't. It would be preferable for the version number to be 1.4.0-rc1 instead.

Thanks!
Punya

On Tue, May 19, 2015 at 12:20 PM Sean Owen so...@cloudera.com wrote:

Before I vote, I wanted to point out there are still 9 Blockers for 1.4.0. I'd like to use this status to really mean "must happen before the release". Many of these may be already fixed or aren't really blockers -- they can just be updated accordingly. I bet at least one will require further work if it's really meant for 1.4, so all this means is there is likely to be another RC. We should still kick the tires on RC1. (I also assume we should be extra conservative about what is merged into 1.4 at this point.)

SPARK-6784 SQL Clean up all the inbound/outbound conversions for DateType (Adrian Wang)
SPARK-6811 SparkR Building binary R packages for SparkR (Shivaram Venkataraman)
SPARK-6941 SQL Provide a better error message to explain that tables created from RDDs are immutable
SPARK-7158 SQL collect and take return different results
SPARK-7478 SQL Add a SQLContext.getOrCreate to maintain a singleton instance of SQLContext (Tathagata Das)
SPARK-7616 SQL Overwriting a partitioned parquet table corrupt data (Cheng Lian)
SPARK-7654 SQL DataFrameReader and DataFrameWriter for input/output API (Reynold Xin)
SPARK-7662 SQL Exception of multi-attribute generator analysis in projection
SPARK-7713 SQL Use shared broadcast hadoop conf for partitioned table scan (Yin Huai)

On Tue, May 19, 2015 at 5:10 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
A couple of other process things:

1. Please *keep voting* (+1/-1) on this thread even if we find some issues, until we cut RC2. This lets us pipeline the QA.
2. The SQL team owes a JIRA clean-up (forthcoming shortly)... there are still a few Blockers that aren't.

On Tue, May 19, 2015 at 9:10 AM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.
Re: [VOTE] Release Apache Spark 1.4.0 (RC1)
Before I vote, I wanted to point out there are still 9 Blockers for 1.4.0. I'd like to use this status to really mean "must happen before the release". Many of these may be already fixed or aren't really blockers -- they can just be updated accordingly. I bet at least one will require further work if it's really meant for 1.4, so all this means is there is likely to be another RC. We should still kick the tires on RC1. (I also assume we should be extra conservative about what is merged into 1.4 at this point.)

SPARK-6784 SQL Clean up all the inbound/outbound conversions for DateType (Adrian Wang)
SPARK-6811 SparkR Building binary R packages for SparkR (Shivaram Venkataraman)
SPARK-6941 SQL Provide a better error message to explain that tables created from RDDs are immutable
SPARK-7158 SQL collect and take return different results
SPARK-7478 SQL Add a SQLContext.getOrCreate to maintain a singleton instance of SQLContext (Tathagata Das)
SPARK-7616 SQL Overwriting a partitioned parquet table corrupt data (Cheng Lian)
SPARK-7654 SQL DataFrameReader and DataFrameWriter for input/output API (Reynold Xin)
SPARK-7662 SQL Exception of multi-attribute generator analysis in projection
SPARK-7713 SQL Use shared broadcast hadoop conf for partitioned table scan (Yin Huai)

On Tue, May 19, 2015 at 5:10 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark version 1.4.0!

The tag to be voted on is v1.4.0-rc1 (commit 777a081):
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=777a08166f1fb144146ba32581d4632c3466541e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1092/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/

Please vote on releasing this package as Apache Spark 1.4.0! The vote is open until Friday, May 22, at 17:03 UTC and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 1.4.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

== How can I help test this release? ==
If you are a Spark user, you can help us test this release by taking a Spark 1.3 workload and running on this release candidate, then reporting any regressions.

== What justifies a -1 vote for this release? ==
This vote is happening towards the end of the 1.4 QA period, so -1 votes should only occur for significant regressions from 1.3.1. Bugs already present in 1.3.X, minor regressions, or bugs related to new features will not block this release.