git commit: [SPARK-1100] prevent Spark from overwriting directory silently

2014-03-01 Thread pwendell
Repository: spark Updated Branches: refs/heads/master fe195ae11 - 3a8b698e9 [SPARK-1100] prevent Spark from overwriting directory silently Thanks to Diana Carroll for reporting this issue (https://spark-project.atlassian.net/browse/SPARK-1100). The current saveAsTextFile/SequenceFile will
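
The behaviour named in the commit title (refusing to overwrite an existing output directory) can be sketched roughly as follows. This is a plain-Python illustration, not Spark's code: `save_text_file` and the `part-00000` file name are hypothetical stand-ins chosen for the example.

```python
import os
import tempfile


def save_text_file(lines, output_dir):
    """Write lines to output_dir/part-00000, refusing to reuse an existing directory.

    Hypothetical helper: it mirrors the idea in the commit title only.
    """
    if os.path.exists(output_dir):
        # Fail loudly instead of silently replacing earlier output.
        raise FileExistsError(f"Output directory {output_dir} already exists")
    os.makedirs(output_dir)
    with open(os.path.join(output_dir, "part-00000"), "w") as out:
        for line in lines:
            out.write(line + "\n")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "out")
    save_text_file(["a", "b"], target)   # first write succeeds
    try:
        save_text_file(["c"], target)    # second write is rejected
    except FileExistsError as err:
        print("refused:", err)
```

The check runs before any file is opened, so a rejected call leaves the earlier output untouched.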

git commit: SPARK-1084.2 (resubmitted)

2014-03-02 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 353ac6b4f - fd31adbf2 SPARK-1084.2 (resubmitted) (Ported from https://github.com/apache/incubator-spark/pull/650 ) This adds one more change though, to fix the scala version warning introduced by json4s recently. Author: Sean Owen

svn commit: r1573412 [2/2] - in /spark/site/docs/0.9.0: ./ api/core/org/apache/spark/ api/core/org/apache/spark/scheduler/ api/mllib/ api/mllib/index/ api/mllib/org/apache/spark/mllib/recommendation/

2014-03-02 Thread pwendell
Modified: spark/site/docs/0.9.0/api/pyspark/pyspark.mllib.regression.RidgeRegressionWithSGD-class.html URL: http://svn.apache.org/viewvc/spark/site/docs/0.9.0/api/pyspark/pyspark.mllib.regression.RidgeRegressionWithSGD-class.html?rev=1573412&r1=1573411&r2=1573412&view=diff

svn commit: r1573418 - /spark/site/docs/0.9.0/

2014-03-02 Thread pwendell
Author: pwendell Date: Mon Mar 3 01:57:26 2014 New Revision: 1573418 URL: http://svn.apache.org/r1573418 Log: Adding google analytics to the 0.9 docs. Modified: spark/site/docs/0.9.0/README.md spark/site/docs/0.9.0/api.html spark/site/docs/0.9.0/bagel-programming-guide.html

git commit: SPARK-1184: Update the distribution tar.gz to include spark-assembly jar

2014-03-05 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-0.9 7ea89ec45 - 0fc0fdb12 SPARK-1184: Update the distribution tar.gz to include spark-assembly jar See JIRA for details. Author: Mark Grover m...@apache.org Closes #78 from markgrover/SPARK-1184 and squashes the following commits:

git commit: SPARK-1187, Added missing Python APIs

2014-03-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 3eb009f36 - 3d3acef04 SPARK-1187, Added missing Python APIs The following Python APIs are added, RDD.id() SparkContext.setJobGroup() SparkContext.setLocalProperty() SparkContext.getLocalProperty() SparkContext.sparkUser() was raised

git commit: SPARK-942: Do not materialize partitions when DISK_ONLY storage level is used

2014-03-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 3d3acef04 - 40566e10a SPARK-942: Do not materialize partitions when DISK_ONLY storage level is used This is a port of a pull request originally targeted at incubator-spark: https://github.com/apache/incubator-spark/pull/180 Essentially if

git commit: Small clean-up to flatmap tests

2014-03-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 9ae919c02 - 33baf14b0 Small clean-up to flatmap tests Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/33baf14b Tree:

git commit: [SPARK-1194] Fix the same-RDD rule for cache replacement

2014-03-07 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 8ad486add - 0b7b7fd45 [SPARK-1194] Fix the same-RDD rule for cache replacement SPARK-1194: https://spark-project.atlassian.net/browse/SPARK-1194 In the current implementation, when selecting candidate blocks to be swapped out, once we

git commit: Fix markup errors introduced in #33 (SPARK-1189)

2014-03-09 Thread pwendell
Repository: spark Updated Branches: refs/heads/master f6f9d02e8 - faf4cad1d Fix markup errors introduced in #33 (SPARK-1189) These were causing errors on the configuration page. Author: Patrick Wendell pwend...@gmail.com Closes #111 from pwendell/master and squashes the following commits

git commit: SPARK-1205: Clean up callSite/origin/generator.

2014-03-10 Thread pwendell
this. Author: Patrick Wendell pwend...@gmail.com Closes #106 from pwendell/callsite and squashes the following commits: fc1d009 [Patrick Wendell] Compile fix e17fb76 [Patrick Wendell] Review feedback: callSite - creationSite 62e77ef [Patrick Wendell] Review feedback 576e60b [Patrick Wendell] SPARK

git commit: [SPARK-1232] Fix the hadoop 0.23 yarn build

2014-03-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/master af7f2f109 - c8c59b326 [SPARK-1232] Fix the hadoop 0.23 yarn build Author: Thomas Graves tgra...@apache.org Closes #127 from tgravescs/SPARK-1232 and squashes the following commits: c05cfd4 [Thomas Graves] Fix the hadoop 0.23 yarn build

git commit: Fix example bug: compile error

2014-03-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 9032f7c0d - 31a704004 Fix example bug: compile error Author: jianghan jiang...@xiaomi.com Closes #132 from pooorman/master and squashes the following commits: 54afbe0 [jianghan] Fix example bug: compile error Project:

git commit: Fix example bug: compile error

2014-03-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-0.9 51a77e977 - 87e4dd58c Fix example bug: compile error Author: jianghan jiang...@xiaomi.com Closes #132 from pooorman/master and squashes the following commits: 54afbe0 [jianghan] Fix example bug: compile error (cherry picked from

git commit: SPARK-1236 - Upgrade Jetty to 9.1.3.v20140225.

2014-03-13 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 698373211 - ca4bf8c57 SPARK-1236 - Upgrade Jetty to 9.1.3.v20140225. Author: Reynold Xin r...@apache.org Closes #113 from rxin/jetty9 and squashes the following commits: 867a2ce [Reynold Xin] Updated Jetty version to 9.1.3.v20140225 in

git commit: [bugfix] wrong client arg, should use executor-cores

2014-03-13 Thread pwendell
Repository: spark Updated Branches: refs/heads/master ca4bf8c57 - 181b130a0 [bugfix] wrong client arg, should use executor-cores The client arg is wrong; it should be executor-cores. It causes executors to fail to start when executor-cores is specified. Author: Tianshuo Deng td...@twitter.com

[2/4] git commit: SPARK-1168, Added foldByKey to pyspark.

2014-03-16 Thread pwendell
SPARK-1168, Added foldByKey to pyspark. Author: Prashant Sharma prashan...@imaginea.com Closes #115 from ScrapCodes/SPARK-1168/pyspark-foldByKey and squashes the following commits: db6f67e [Prashant Sharma] SPARK-1168, Added foldByKey to pyspark. Project:
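
For readers unfamiliar with the operation, the semantics of a fold-by-key can be sketched locally in plain Python. This is an illustration only and does not use PySpark; `fold_by_key` is a hypothetical name. In a distributed setting the zero value may be applied more than once, so it should be neutral for the operator (0 for addition, for example).

```python
def fold_by_key(pairs, zero_value, op):
    """Fold the values of each key with op, starting from zero_value.

    Local, single-process sketch; zero_value should be immutable here
    because the same object seeds every key.
    """
    result = {}
    for key, value in pairs:
        acc = result.get(key, zero_value)
        result[key] = op(acc, value)
    return result


if __name__ == "__main__":
    data = [("a", 1), ("b", 2), ("a", 3)]
    print(fold_by_key(data, 0, lambda x, y: x + y))
```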

git commit: [Hot Fix #42] Do not stop SparkUI if bind() is not called

2014-03-20 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 66a03e5fe - ca76423e2 [Hot Fix #42] Do not stop SparkUI if bind() is not called This is a bug fix for #42 (79d07d66040f206708e14de393ab0b80020ed96a). In Master, we do not bind() each SparkUI because we do not want to start a server for

git commit: Renamed stageIdToActiveJob to jobIdToActiveJob.

2014-04-02 Thread pwendell
Repository: spark Updated Branches: refs/heads/master ea9de658a - 11973a7bd Renamed stageIdToActiveJob to jobIdToActiveJob. This data structure was misused and, as a result, later renamed to an incorrect name. This data structure seems to have gotten into this tangled state as a result of

[1/2] [SPARK-1371][WIP] Compression support for Spark SQL in-memory columnar storage

2014-04-02 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 78236334e - 1faa57971 http://git-wip-us.apache.org/repos/asf/spark/blob/1faa5797/sql/core/src/test/scala/org/apache/spark/sql/columnar/NullableColumnBuilderSuite.scala --

git commit: StopAfter / TopK related changes

2014-04-02 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 1faa57971 - ed730c950 StopAfter / TopK related changes 1. Renamed StopAfter to Limit to be more consistent with naming in other relational databases. 2. Renamed TopK to TakeOrdered to be more consistent with Spark RDD API. 3. Avoid
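
The difference between the two renamed operators can be sketched in plain Python. This is an illustration of the general semantics only, not the Spark SQL operators themselves; `limit` and `take_ordered` are hypothetical function names.

```python
import heapq
from itertools import islice


def limit(rows, n):
    """Return the first n rows in arrival order (no sorting)."""
    return list(islice(rows, n))


def take_ordered(rows, n, key=None):
    """Return the n smallest rows according to key, in sorted order."""
    return heapq.nsmallest(n, rows, key=key)


if __name__ == "__main__":
    rows = [5, 3, 8, 1]
    print(limit(rows, 2))         # first two rows as they arrive
    print(take_ordered(rows, 2))  # two smallest rows
```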

git commit: [SQL] SPARK-1364 Improve datatype and test coverage for ScalaReflection schema inference.

2014-04-02 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 9c65fa76f - 47ebea546 [SQL] SPARK-1364 Improve datatype and test coverage for ScalaReflection schema inference. Author: Michael Armbrust mich...@databricks.com Closes #293 from marmbrus/reflectTypes and squashes the following commits:

git commit: SPARK-1337: Application web UI garbage collects newest stages instead old ones

2014-04-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/stage-clean-up [created] 64c593ed0 SPARK-1337: Application web UI garbage collects newest stages instead old ones Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/64c593ed
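
The intended behaviour (discard the oldest retained stages, keep the newest) can be sketched as below. This assumes stages are stored in completion order, oldest first; `trim_oldest` and the `retained` limit are hypothetical names, not taken from the commit.

```python
from collections import deque


def trim_oldest(stages, retained):
    """Drop entries from the front (oldest) until at most `retained` remain."""
    while len(stages) > retained:
        stages.popleft()
    return stages


if __name__ == "__main__":
    completed = deque(["stage-1", "stage-2", "stage-3", "stage-4"])
    print(list(trim_oldest(completed, 2)))
```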

git commit: SPARK-1337: Application web UI garbage collects newest stages

2014-04-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-0.9 d9c7a808c - 7f727cf97 SPARK-1337: Application web UI garbage collects newest stages Simple fix... Author: Patrick Wendell pwend...@gmail.com Closes #320 from pwendell/stage-clean-up and squashes the following commits: 29be62e

git commit: SPARK-1404: Always upgrade spark-env.sh vars to environment vars

2014-04-04 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 7f32fd42a - 01cf4c402 SPARK-1404: Always upgrade spark-env.sh vars to environment vars This was broken when spark-env.sh was made idempotent, as the idempotence check is an environment variable, but the spark-env.sh variables may not have

git commit: Add test utility for generating Jar files with compiled classes.

2014-04-04 Thread pwendell
in. Author: Patrick Wendell pwend...@gmail.com Closes #326 from pwendell/class-loader-test-utils and squashes the following commits: ff3e88e [Patrick Wendell] Add test utility for generating Jar files with compiled classes. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit

git commit: [SPARK-1419] Bumped parent POM to apache 14

2014-04-04 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 5f3c1bb51 - 1347ebd4b [SPARK-1419] Bumped parent POM to apache 14 Keeping up-to-date with the parent, which includes some bugfixes. Author: Mark Hamstra markhams...@gmail.com Closes #328 from markhamstra/Apache14 and squashes the

git commit: Fix for PR #195 for Java 6

2014-04-05 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 6e88583ae - 890d63bd4 Fix for PR #195 for Java 6 Use Java 6's recommended equivalent of Java 7's Logger.getGlobal() to retain Java 6 compatibility. See PR #195 Author: Sean Owen so...@cloudera.com Closes #334 from

git commit: Fix SPARK-1420 The maven build error for Spark Catalyst

2014-04-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 0b8551678 - 7012ffafa Fix SPARK-1420 The maven build error for Spark Catalyst Author: witgo wi...@qq.com Closes #333 from witgo/SPARK-1420 and squashes the following commits: 902519e [witgo] add dependency scala-reflect to catalyst

git commit: [SPARK-1259] Make RDD locally iterable

2014-04-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 7012ffafa - e258e5040 [SPARK-1259] Make RDD locally iterable Author: Egor Pakhomov pahomov.e...@gmail.com Closes #156 from epahomov/SPARK-1259 and squashes the following commits: 8ec8f24 [Egor Pakhomov] Make to local iterator shorter
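
The idea of iterating a partitioned dataset locally, one partition at a time, can be sketched with a generator. This is a plain-Python illustration under the assumption that each partition is represented by a zero-argument callable; `to_local_iterator` is a hypothetical name and not the Spark method.

```python
def to_local_iterator(partitions):
    """Yield elements partition by partition.

    `partitions` is a list of zero-argument callables; each one is invoked
    only when iteration reaches it, so at most one partition is computed
    and held at a time.
    """
    for compute_partition in partitions:
        for element in compute_partition():
            yield element


if __name__ == "__main__":
    parts = [lambda: [1, 2], lambda: [3], lambda: []]
    for value in to_local_iterator(parts):
        print(value)
```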

git commit: SPARK-1387. Update build plugins, avoid plugin version warning, centralize versions

2014-04-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master e258e5040 - 856c50f59 SPARK-1387. Update build plugins, avoid plugin version warning, centralize versions Another handful of small build changes to organize and standardize a bit, and avoid warnings: - Update Maven plugin versions for

git commit: SPARK-1349: spark-shell gets its own command history

2014-04-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 856c50f59 - 7ce52c4a7 SPARK-1349: spark-shell gets its own command history Currently, spark-shell shares its command history with scala repl. This fix is simply a modification of the default FileBackedHistory file setting:

git commit: SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging

2014-04-06 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 7ce52c4a7 - 410655843 SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging Previously, we decided whether to include datanucleus jars based on the existence of a spark-hive-assembly jar, which was

git commit: SPARK-1154: Clean up app folders in worker nodes

2014-04-06 Thread pwendell
velvia/SPARK-1154-cleanup-app-folders and squashes the following commits: 0689995 [Evan Chan] CR from @aarondav - move config, clarify for standalone mode 9f10d96 [Evan Chan] CR from @pwendell - rename configs and add cleanup.enabled f2f6027 [Evan Chan] CR from @andrewor14 553d8c2 [Kelvin Chu] change

git commit: SPARK-1431: Allow merging conflicting pull requests

2014-04-06 Thread pwendell
pwend...@gmail.com Closes #342 from pwendell/merge-conflicts and squashes the following commits: cdce61a [Patrick Wendell] SPARK-1431: Allow merging conflicting pull requests Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

git commit: SPARK-1432: Make sure that all metadata fields are properly cleaned

2014-04-07 Thread pwendell
Repository: spark Updated Branches: refs/heads/master b5bae849d - a3c51c6ea SPARK-1432: Make sure that all metadata fields are properly cleaned While working on SPARK-1337 with @pwendell, we noticed that not all of the metadata maps in JobProgressListener were being properly cleaned

git commit: SPARK-1433: Upgrade Mesos dependency to 0.17.0

2014-04-08 Thread pwendell
Repository: spark Updated Branches: refs/heads/master fac6085cd - 12c077d5a SPARK-1433: Upgrade Mesos dependency to 0.17.0 Mesos 0.13.0 was released 6 months ago. Upgrade Mesos dependency to 0.17.0 Author: Sandeep sand...@techaddict.me Closes #355 from techaddict/mesos_update and squashes

git commit: [SPARK-1434] [MLLIB] change labelParser from anonymous function to trait

2014-04-08 Thread pwendell
Repository: spark Updated Branches: refs/heads/master ce8ec5456 - b9e0c937d [SPARK-1434] [MLLIB] change labelParser from anonymous function to trait This is a patch to address @mateiz 's comment in https://github.com/apache/spark/pull/245 MLUtils#loadLibSVMData uses an anonymous function

git commit: Spark-939: allow user jars to take precedence over spark jars

2014-04-08 Thread pwendell
Repository: spark Updated Branches: refs/heads/master b9e0c937d - fa0524fd0 Spark-939: allow user jars to take precedence over spark jars I still need to do a small bit of re-factoring [mostly the one Java file I'll switch it back to a Scala file and use it in both the class loaders], but

[1/2] [SPARK-1390] Refactoring of matrices backed by RDDs

2014-04-09 Thread pwendell
Repository: spark Updated Branches: refs/heads/master fa0524fd0 - 9689b663a http://git-wip-us.apache.org/repos/asf/spark/blob/9689b663/mllib/src/test/scala/org/apache/spark/mllib/linalg/SVDSuite.scala -- diff --git

git commit: SPARK-1407 drain event queue before stopping event logger

2014-04-09 Thread pwendell
Repository: spark Updated Branches: refs/heads/master bde9cc11f - eb5f2b642 SPARK-1407 drain event queue before stopping event logger Author: Kan Zhang kzh...@apache.org Closes #366 from kanzhang/SPARK-1407 and squashes the following commits: cd0629f [Kan Zhang] code refactoring and adding
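
One common way to guarantee that queued events are processed before a consumer shuts down is a sentinel placed at the end of the queue. The sketch below shows that general pattern in plain Python; it is not the approach taken in the commit, and `EventBus` is a hypothetical class.

```python
import queue
import threading


class EventBus:
    """Deliver posted events to a handler on a background thread."""

    _STOP = object()  # sentinel marking the end of the stream

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, event):
        self._queue.put(event)

    def _run(self):
        while True:
            event = self._queue.get()
            if event is self._STOP:
                return
            self._handler(event)

    def stop(self):
        # The sentinel is queued behind every pending event, so the worker
        # handles all of them before it exits; join() waits for that.
        self._queue.put(self._STOP)
        self._thread.join()


if __name__ == "__main__":
    logged = []
    bus = EventBus(logged.append)
    for i in range(5):
        bus.post(i)
    bus.stop()
    print(logged)
```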

git commit: [SPARK-1357 (fix)] remove empty line after :: DeveloperApi/Experimental ::

2014-04-09 Thread pwendell
Repository: spark Updated Branches: refs/heads/master eb5f2b642 - 0adc932ad [SPARK-1357 (fix)] remove empty line after :: DeveloperApi/Experimental :: Remove empty line after :: DeveloperApi/Experimental :: in comments to make the original doc show up in the preview of the generated html

[1/3] git commit: SPARK-1407 drain event queue before stopping event logger

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 bde9cc11f - 8ca3b2bc9 SPARK-1407 drain event queue before stopping event logger Author: Kan Zhang kzh...@apache.org Closes #366 from kanzhang/SPARK-1407 and squashes the following commits: cd0629f [Kan Zhang] code refactoring and

[3/3] git commit: SPARK-729: Closures not always serialized at capture time

2014-04-10 Thread pwendell
SPARK-729: Closures not always serialized at capture time [SPARK-729](https://spark-project.atlassian.net/browse/SPARK-729) concerns when free variables in closure arguments to transformations are captured. Currently, it is possible for closures to get the environment in which they are

[2/3] git commit: [SPARK-1357 (fix)] remove empty line after :: DeveloperApi/Experimental ::

2014-04-10 Thread pwendell
[SPARK-1357 (fix)] remove empty line after :: DeveloperApi/Experimental :: Remove empty line after :: DeveloperApi/Experimental :: in comments to make the original doc show up in the preview of the generated html docs. Thanks @andrewor14 ! Author: Xiangrui Meng m...@databricks.com Closes #373

git commit: SPARK-1446: Spark examples should not do a System.exit

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 8ca3b2bc9 - e55cc4bae SPARK-1446: Spark examples should not do a System.exit Spark examples should exit cleanly using the SparkContext.stop() method rather than System.exit. System.exit can cause issues like those in SPARK-1407. Author: Sandeep

git commit: Revert SPARK-729: Closures not always serialized at capture time

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/master e55cc4bae - e6d4a74d2 Revert SPARK-729: Closures not always serialized at capture time This reverts commit 8ca3b2bc90a63b23a03f339e390174cd7a672b40. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

[2/3] git commit: Revert SPARK-729: Closures not always serialized at capture time

2014-04-10 Thread pwendell
Revert SPARK-729: Closures not always serialized at capture time This reverts commit 8ca3b2bc90a63b23a03f339e390174cd7a672b40. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e6d4a74d Tree:

[3/3] git commit: Fix SPARK-1413: Parquet messes up stdout and stdin when used in Spark REPL

2014-04-10 Thread pwendell
Fix SPARK-1413: Parquet messes up stdout and stdin when used in Spark REPL Author: witgo wi...@qq.com Closes #325 from witgo/SPARK-1413 and squashes the following commits: e57cd8e [witgo] use scala reflection to access and call the SLF4JBridgeHandler methods 45c8f40 [witgo] Merge branch

[1/2] [SPARK-1276] Add a HistoryServer to render persisted UI

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/master a74fbbbca - 79820fe82 http://git-wip-us.apache.org/repos/asf/spark/blob/79820fe8/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala -- diff --git

[1/2] [SPARK-1276] Add a HistoryServer to render persisted UI

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 a74fbbbca - 9ae80bf9b http://git-wip-us.apache.org/repos/asf/spark/blob/9ae80bf9/core/src/test/scala/org/apache/spark/util/JsonProtocolSuite.scala -- diff --git

git commit: Revert SPARK-1433: Upgrade Mesos dependency to 0.17.0

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 3bd312940 - 7b52b6631 Revert SPARK-1433: Upgrade Mesos dependency to 0.17.0 This reverts commit 12c077d5aa0b76a808a55db625c9677a52bd43f9. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

git commit: SPARK-1202 - Add a cancel button in the UI for stages

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/master f99401a63 - 2c557837b SPARK-1202 - Add a cancel button in the UI for stages Author: Sundeep Narravula sundeepn@superduel.local Author: Sundeep Narravula sunde...@dhcpx-204-110.corp.yahoo.com Closes #246 from sundeepn/uikilljob and

git commit: Add Spark v0.9.1 to ec2 launch script and use it as the default

2014-04-10 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 41df293fb - 59de39b2e Add Spark v0.9.1 to ec2 launch script and use it as the default Mainly ported from branch-0.9. Author: Harvey Feng hyfeng...@gmail.com Closes #385 from harveyfeng/0.9.1-ec2 and squashes the following commits:

git commit: SPARK-1202: Improvements to task killing in the UI.

2014-04-10 Thread pwendell
: Patrick Wendell pwend...@gmail.com Closes #386 from pwendell/kill-link and squashes the following commits: 8efe02b [Patrick Wendell] Improvements to task killing in the UI. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

git commit: SPARK-1202: Improvements to task killing in the UI.

2014-04-10 Thread pwendell
. Author: Patrick Wendell pwend...@gmail.com Closes #386 from pwendell/kill-link and squashes the following commits: 8efe02b [Patrick Wendell] Improvements to task killing in the UI. (cherry picked from commit 44f654eecd3c181f2aeaff3871acf7f00eacc6b9) Signed-off-by: Patrick Wendell pwend...@gmail.com

git commit: Some clean up in build/docs

2014-04-11 Thread pwendell
pwendell/maven-clean and squashes the following commits: f0447fa [Patrick Wendell] Minor doc clean-up Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/98225a6e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/98225a6e

git commit: HOTFIX: Ignore python metastore files in RAT checks.

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/master f5ace8da3 - 6a0f8e35c HOTFIX: Ignore python metastore files in RAT checks. This was causing some errors with pull request tests. Author: Patrick Wendell pwend...@gmail.com Closes #393 from pwendell/hotfix and squashes the following

git commit: [FIX] make coalesce test deterministic in RDDSuite

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 6a0f8e35c - 7038b00be [FIX] make coalesce test deterministic in RDDSuite Make coalesce test deterministic by setting pre-defined seeds. (Saw random failures in other PRs.) Author: Xiangrui Meng m...@databricks.com Closes #387 from
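
The general technique (seeding the random source so a test sees the same data on every run) can be sketched in plain Python. This is not the RDDSuite code; `sample_partition_sizes` is a hypothetical helper.

```python
import random


def sample_partition_sizes(num_partitions, seed):
    """Generate pseudo-random partition sizes that depend only on the seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.randint(1, 100) for _ in range(num_partitions)]


if __name__ == "__main__":
    first = sample_partition_sizes(5, seed=42)
    second = sample_partition_sizes(5, seed=42)
    print(first == second)  # same seed, same data
```

Using a private `random.Random` instance rather than the module-level functions keeps the test independent of whatever other code has done to the global generator.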

git commit: [WIP] [SPARK-1328] Add vector statistics

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 7038b00be - fdfb45e69 [WIP] [SPARK-1328] Add vector statistics As with the new vector system in MLlib, we find that it is good to add some new APIs to process the `RDD[Vector]`. Besides, the former implementation of `computeStat` is not

[1/3] git commit: HOTFIX: Ignore python metastore files in RAT checks.

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 e6128b509 - ce0ce3d9e HOTFIX: Ignore python metastore files in RAT checks. This was causing some errors with pull request tests. Author: Patrick Wendell pwend...@gmail.com Closes #393 from pwendell/hotfix and squashes the following

[3/3] git commit: [WIP] [SPARK-1328] Add vector statistics

2014-04-11 Thread pwendell
[WIP] [SPARK-1328] Add vector statistics As with the new vector system in MLlib, we find that it is good to add some new APIs to process the `RDD[Vector]`. Besides, the former implementation of `computeStat` is not stable, which could lose precision, and has the possibility to cause `NaN` in
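
A well-known way to compute column-wise mean and variance without the precision loss of the naive sum-of-squares formula is Welford's online update. The sketch below shows that technique in plain Python; the snippet above does not say which algorithm the commit uses, so this should be read as an illustration of a numerically stable approach, with `column_stats` as a hypothetical name.

```python
def column_stats(vectors):
    """Return (count, means, sample variances) per column using Welford's update."""
    count = 0
    means = []
    m2 = []  # running sum of squared deviations from the current mean
    for vec in vectors:
        if count == 0:
            means = [0.0] * len(vec)
            m2 = [0.0] * len(vec)
        count += 1
        for i, x in enumerate(vec):
            delta = x - means[i]
            means[i] += delta / count
            m2[i] += delta * (x - means[i])
    if count < 2:
        variances = [0.0] * len(means)
    else:
        variances = [s / (count - 1) for s in m2]
    return count, means, variances


if __name__ == "__main__":
    # Large offsets are where the naive formula tends to lose precision.
    rows = [[1e9 + 1.0, 10.0], [1e9 + 2.0, 20.0], [1e9 + 3.0, 30.0]]
    print(column_stats(rows))
```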

[2/3] git commit: [FIX] make coalesce test deterministic in RDDSuite

2014-04-11 Thread pwendell
[FIX] make coalesce test deterministic in RDDSuite Make coalesce test deterministic by setting pre-defined seeds. (Saw random failures in other PRs.) Author: Xiangrui Meng m...@databricks.com Closes #387 from mengxr/fix-random and squashes the following commits: 59bc16f [Xiangrui Meng] make

git commit: Update WindowedDStream.scala

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 ce0ce3d9e - dac6240cf Update WindowedDStream.scala Update the content of the exception when windowDuration is not a multiple of parent.slideDuration Author: baishuo(白硕) vc_j...@hotmail.com Closes #390 from baishuo/windowdstream and
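
The constraint behind this exception can be sketched as a simple divisibility check. The code below is plain Python, not the Scala in WindowedDStream.scala; `validate_window`, the millisecond units, and the message wording are assumptions made for the example.

```python
def validate_window(window_ms, parent_slide_ms):
    """Return how many parent batches fit in one window, or raise if they do not fit evenly."""
    if window_ms % parent_slide_ms != 0:
        raise ValueError(
            f"The window duration ({window_ms} ms) must be a multiple of "
            f"the slide duration of the parent stream ({parent_slide_ms} ms)"
        )
    return window_ms // parent_slide_ms


if __name__ == "__main__":
    print(validate_window(6000, 2000))
    try:
        validate_window(5000, 2000)
    except ValueError as err:
        print("rejected:", err)
```

Including both durations in the message tells the user which value to adjust.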

git commit: Update WindowedDStream.scala

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/master fdfb45e69 - aa8bb117a Update WindowedDStream.scala Update the content of the exception when windowDuration is not a multiple of parent.slideDuration Author: baishuo(白硕) vc_j...@hotmail.com Closes #390 from baishuo/windowdstream and

git commit: Update WindowedDStream.scala

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-0.9 19cf2f73e - 4a325e17b Update WindowedDStream.scala Update the content of the exception when windowDuration is not a multiple of parent.slideDuration Author: baishuo(白硕) vc_j...@hotmail.com Closes #390 from baishuo/windowdstream and

git commit: SPARK-1057 (alternative) Remove fastutil

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/master aa8bb117a - 165e06a74 SPARK-1057 (alternative) Remove fastutil (This is for discussion at this point -- I'm not suggesting this should be committed.) This is what removing fastutil looks like. Much of it is straightforward, like using

git commit: SPARK-1057 (alternative) Remove fastutil

2014-04-11 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 dac6240cf - 4dfcb3860 SPARK-1057 (alternative) Remove fastutil (This is for discussion at this point -- I'm not suggesting this should be committed.) This is what removing fastutil looks like. Much of it is straightforward, like

git commit: [Fix #204] Update out-dated comments

2014-04-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 6aa08c39c - c2d160fbe [Fix #204] Update out-dated comments This PR is self-explanatory. Author: Andrew Or andrewo...@gmail.com Closes #381 from andrewor14/master and squashes the following commits: 3e8dde2 [Andrew Or] Fix comments for

git commit: [SPARK-1403] Move the class loader creation back to where it was in 0.9.0

2014-04-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/master c2d160fbe - ca11919e6 [SPARK-1403] Move the class loader creation back to where it was in 0.9.0 [SPARK-1403] I investigated why spark 0.9.0 loads fine on mesos while spark 1.0.0 fails. What I found was that in SparkEnv.scala, while

git commit: [SPARK-1403] Move the class loader creation back to where it was in 0.9.0

2014-04-12 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 52d401b53 - c970d8698 [SPARK-1403] Move the class loader creation back to where it was in 0.9.0 [SPARK-1403] I investigated why spark 0.9.0 loads fine on mesos while spark 1.0.0 fails. What I found was that in SparkEnv.scala, while

git commit: SPARK-1480: Clean up use of classloaders

2014-04-13 Thread pwendell
the executor classloader did not properly delegate to the context class loader (if it is defined) and in local mode the context class loader is set by the `./spark-submit` script. A unit test is added for that case. Author: Patrick Wendell pwend...@gmail.com Closes #398 from pwendell/class-loaders

git commit: SPARK-1480: Clean up use of classloaders

2014-04-13 Thread pwendell
classloader did not properly delegate to the context class loader (if it is defined) and in local mode the context class loader is set by the `./spark-submit` script. A unit test is added for that case. Author: Patrick Wendell pwend...@gmail.com Closes #398 from pwendell/class-loaders and squashes

git commit: Small syntax error from previous backport

2014-04-13 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-0.9 4a325e17b - 9e8978903 Small syntax error from previous backport Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9e897890 Tree:

git commit: [BUGFIX] In-memory columnar storage bug fixes

2014-04-14 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 1cf565f58 - fdebb6952 [BUGFIX] In-memory columnar storage bug fixes Fixed several bugs of in-memory columnar storage to make `HiveInMemoryCompatibilitySuite` pass. @rxin @marmbrus It is reasonable to include

git commit: [BUGFIX] In-memory columnar storage bug fixes

2014-04-14 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 037fe4d2b - 7dbca68e9 [BUGFIX] In-memory columnar storage bug fixes Fixed several bugs of in-memory columnar storage to make `HiveInMemoryCompatibilitySuite` pass. @rxin @marmbrus It is reasonable to include

git commit: HOTFIX: Use file name and not paths for excludes

2014-04-14 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 7dbca68e9 - 268b53567 HOTFIX: Use file name and not paths for excludes Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/268b5356 Tree:

git commit: SPARK-1488. Resolve scalac feature warnings during build

2014-04-14 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 fdebb6952 - 74718285a SPARK-1488. Resolve scalac feature warnings during build For your consideration: scalac currently notes a number of feature warnings during compilation: ``` [warn] there were 65 feature warning(s); re-run with

git commit: SPARK-1488. Resolve scalac feature warnings during build

2014-04-14 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 268b53567 - 0247b5c54 SPARK-1488. Resolve scalac feature warnings during build For your consideration: scalac currently notes a number of feature warnings during compilation: ``` [warn] there were 65 feature warning(s); re-run with

git commit: [SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation.

2014-04-15 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 692dd6936 - 5812472c1 [SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation. This PR uses Breeze's L-BFGS implementation, and Breeze dependency has already been introduced by Xiangrui's sparse input format work in

[2/2] git commit: Decision Tree documentation for MLlib programming guide

2014-04-15 Thread pwendell
Decision Tree documentation for MLlib programming guide Added documentation for user to use the decision tree algorithms for classification and regression in Spark 1.0 release. Apart from a general review, I need specific input on the following: * I had to move a lot of the existing

git commit: SPARK-1455: Better isolation for unit tests.

2014-04-15 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 194ed067b - 110e825aa SPARK-1455: Better isolation for unit tests. This is a simple first step towards avoiding running the Hive tests whenever possible. Author: Patrick Wendell pwend...@gmail.com Closes #420 from pwendell/test

git commit: [FIX] update sbt-idea to version 1.6.0

2014-04-15 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 110e825aa - 33d6e37cd [FIX] update sbt-idea to version 1.6.0 I saw `No scala-library*.jar in Scala compiler library` error in IDEA. It seems upgrading `sbt-idea` to 1.6.0 fixed the problem. Author: Xiangrui Meng m...@databricks.com

git commit: [FIX] update sbt-idea to version 1.6.0

2014-04-15 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 5aaf9836f - 8517911ef [FIX] update sbt-idea to version 1.6.0 I saw `No scala-library*.jar in Scala compiler library` error in IDEA. It seems upgrading `sbt-idea` to 1.6.0 fixed the problem. Author: Xiangrui Meng m...@databricks.com

[1/2] [WIP] SPARK-1430: Support sparse data in Python MLlib

2014-04-15 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 8517911ef - 63ca581d9 http://git-wip-us.apache.org/repos/asf/spark/blob/63ca581d/python/pyspark/mllib/tests.py -- diff --git a/python/pyspark/mllib/tests.py

git commit: [SPARK-959] Updated SBT from 0.13.1 to 0.13.2

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 273c2fd08 - 6a10d8016 [SPARK-959] Updated SBT from 0.13.1 to 0.13.2 JIRA issue: [SPARK-959](https://spark-project.atlassian.net/browse/SPARK-959) SBT 0.13.2 has been officially released. This version updated Ivy 2.0 to Ivy 2.3, which

git commit: [SPARK-959] Updated SBT from 0.13.1 to 0.13.2

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 e5130d978 - 1ea9a21f4 [SPARK-959] Updated SBT from 0.13.1 to 0.13.2 JIRA issue: [SPARK-959](https://spark-project.atlassian.net/browse/SPARK-959) SBT 0.13.2 has been officially released. This version updated Ivy 2.0 to Ivy 2.3, which

git commit: Loads test tables when running sbt hive/console without HIVE_DEV_HOME

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 5fe18a74f - 9e908ab2e Loads test tables when running sbt hive/console without HIVE_DEV_HOME When running Hive tests, the working directory is `$SPARK_HOME/sql/hive`, while when running `sbt hive/console`, it becomes `$SPARK_HOME`, and

git commit: Loads test tables when running sbt hive/console without HIVE_DEV_HOME

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/master c0273d806 - fec462c15 Loads test tables when running sbt hive/console without HIVE_DEV_HOME When running Hive tests, the working directory is `$SPARK_HOME/sql/hive`, while when running `sbt hive/console`, it becomes `$SPARK_HOME`, and

git commit: update spark.default.parallelism

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 9e908ab2e - e4f5577e2 update spark.default.parallelism Actually, the value 8 is only valid in Mesos fine-grained mode: `override def defaultParallelism() = sc.conf.getInt("spark.default.parallelism", 8)` while in

git commit: SPARK-1469: Scheduler mode should accept lower-case definitions and have...

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 82349fbd2 - e269c24db SPARK-1469: Scheduler mode should accept lower-case definitions and have... ... nicer error messages There are two improvements to Scheduler Mode: 1. Made the built-in ones case-insensitive (fair/FAIR, fifo/FIFO).

git commit: SPARK-1469: Scheduler mode should accept lower-case definitions and have...

2014-04-16 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 4479ecd08 - b75301f1f SPARK-1469: Scheduler mode should accept lower-case definitions and have... ... nicer error messages There are two improvements to Scheduler Mode: 1. Made the built-in ones case-insensitive (fair/FAIR,

[3/3] git commit: FIX: Don't build Hive in assembly unless running Hive tests.

2014-04-17 Thread pwendell
FIX: Don't build Hive in assembly unless running Hive tests. This will make the tests more stable when not running SQL tests. Author: Patrick Wendell pwend...@gmail.com Closes #439 from pwendell/hive-tests and squashes the following commits: 88a6032 [Patrick Wendell] FIX: Don't build Hive

[2/3] git commit: Add clean to build

2014-04-17 Thread pwendell
Add clean to build Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/67d01d85 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/67d01d85 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/67d01d85 Branch:

git commit: HOTFIX: Ignore streaming UI test

2014-04-17 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 322527259 - 1c0dc3733 HOTFIX: Ignore streaming UI test This is currently causing many builds to hang. https://issues.apache.org/jira/browse/SPARK-1530 Author: Patrick Wendell pwend...@gmail.com Closes #440 from pwendell/uitest-fix

[1/2] Clean up and simplify Spark configuration

2014-04-21 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 9ce6ed401 - 29ee101c7 http://git-wip-us.apache.org/repos/asf/spark/blob/29ee101c/docs/quick-start.md -- diff --git a/docs/quick-start.md b/docs/quick-start.md index

git commit: REPL cleanup.

2014-04-21 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 29ee101c7 - 8e1e7ec7a REPL cleanup. Author: Michael Armbrust mich...@databricks.com Closes #451 from marmbrus/replCleanup and squashes the following commits: 088526a [Michael Armbrust] REPL cleanup. (cherry picked from commit

git commit: [Hot Fix] Ignore org.apache.spark.ui.UISuite tests

2014-04-21 Thread pwendell
Repository: spark Updated Branches: refs/heads/master fb98488fc - af46f1fd0 [Hot Fix] Ignore org.apache.spark.ui.UISuite tests #446 faced a connection refused exception from these tests, causing them to time out and fail after a long time. For now, let's disable these tests. (We recently

git commit: [maven-release-plugin] prepare for next development iteration

2014-04-21 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.0 6cc698fc3 - 188f7c3f6 [maven-release-plugin] prepare for next development iteration Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/188f7c3f Tree:

[2/3] git commit: Revert [maven-release-plugin] prepare release v1.0.0-rc1

2014-04-21 Thread pwendell
Revert [maven-release-plugin] prepare release v1.0.0-rc1 This reverts commit 6cc698fc378256fee9111f66c691ced27f54e973. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d778f66f Tree:
