+1 (non-binding)

Tested Scala, Spark SQL, and MLlib on OS X against Hadoop 2.6.

On Wed, Apr 8, 2015 at 5:35 PM Joseph Bradley <jos...@databricks.com> wrote:

> +1 tested ML-related items on Mac OS X
>
> On Wed, Apr 8, 2015 at 7:59 PM, Krishna Sankar <ksanka...@gmail.com> wrote:
>
> > +1 (non-binding, of course)
> >
> > 1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:16 min
> >      mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11
> > 2. Tested pyspark, MLlib - running as well as comparing results with 1.3.0
> >      pyspark works well with the new IPython 3.0.0 release
> > 2.1. statistics (min,max,mean,Pearson,Spearman) OK
> > 2.2. Linear/Ridge/Lasso Regression OK
> > 2.3. Decision Tree, Naive Bayes OK
> > 2.4. KMeans OK
> >      Center And Scale OK
> > 2.5. RDD operations OK
> >      State of the Union Texts - MapReduce, Filter, sortByKey (word count)
> > 2.6. Recommendation (Movielens medium dataset ~1 M ratings) OK
> >      Model evaluation/optimization (rank, numIter, lambda) with itertools OK
> > 3. Scala - MLlib
> > 3.1. statistics (min,max,mean,Pearson,Spearman) OK
> > 3.2. LinearRegressionWithSGD OK
> > 3.3. Decision Tree OK
> > 3.4. KMeans OK
> > 3.5. Recommendation (Movielens medium dataset ~1 M ratings) OK
> > 4.0. Spark SQL from Python OK
> > 4.1. result = sqlContext.sql("SELECT * from people WHERE State = 'WA'") OK
> >
> > On Tue, Apr 7, 2015 at 10:46 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> >
> > > Please vote on releasing the following candidate as Apache Spark version 1.3.1!
> > >
> > > The tag to be voted on is v1.3.1-rc2 (commit 7c4473a):
> > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=7c4473aa5a7f5de0323394aaedeefbf9738e8eb5
> > >
> > > The list of fixes present in this release can be found at:
> > > http://bit.ly/1C2nVPY
> > >
> > > The release files, including signatures, digests, etc. can be found at:
> > > http://people.apache.org/~pwendell/spark-1.3.1-rc2/
> > >
> > > Release artifacts are signed with the following key:
> > > https://people.apache.org/keys/committer/pwendell.asc
> > >
> > > The staging repository for this release can be found at:
> > > https://repository.apache.org/content/repositories/orgapachespark-1083/
> > >
> > > The documentation corresponding to this release can be found at:
> > > http://people.apache.org/~pwendell/spark-1.3.1-rc2-docs/
> > >
> > > The patches on top of RC1 are:
> > >
> > > [SPARK-6737] Fix memory leak in OutputCommitCoordinator
> > > https://github.com/apache/spark/pull/5397
> > >
> > > [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py
> > > https://github.com/apache/spark/pull/5302
> > >
> > > [SPARK-6205] [CORE] UISeleniumSuite fails for Hadoop 2.x test with NoClassDefFoundError
> > > https://github.com/apache/spark/pull/4933
> > >
> > > Please vote on releasing this package as Apache Spark 1.3.1!
> > >
> > > The vote is open until Saturday, April 11, at 07:00 UTC and passes
> > > if a majority of at least 3 +1 PMC votes are cast.
> > >
> > > [ ] +1 Release this package as Apache Spark 1.3.1
> > > [ ] -1 Do not release this package because ...
> > >
> > > To learn more about Apache Spark, please see
> > > http://spark.apache.org/
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> > > For additional commands, e-mail: dev-h...@spark.apache.org