git commit: SPARK-1203 fix saving to hdfs from yarn

2014-03-19 Thread tgraves
Repository: spark
Updated Branches:
  refs/heads/branch-0.9 d385b5a19 -> 250ec271c


SPARK-1203 fix saving to hdfs from yarn

Author: Thomas Graves tgra...@apache.org

Closes #173 from tgravescs/SPARK-1203 and squashes the following commits:

4fd5ded [Thomas Graves] adding import
964e3f7 [Thomas Graves] SPARK-1203 fix saving to hdfs from yarn


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/250ec271
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/250ec271
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/250ec271

Branch: refs/heads/branch-0.9
Commit: 250ec271c6136b881f7ffcb9ce746c30d4abbf3c
Parents: d385b5a
Author: Thomas Graves tgra...@apache.org
Authored: Wed Mar 19 08:09:20 2014 -0500
Committer: Thomas Graves tgra...@apache.org
Committed: Wed Mar 19 08:19:47 2014 -0500

--
 core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/250ec271/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
index 0b2917b..0bc09b3 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
@@ -44,6 +44,7 @@ import com.clearspring.analytics.stream.cardinality.HyperLogLog
 import org.apache.hadoop.mapred.SparkHadoopWriter
 import org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil
 import org.apache.spark._
+import org.apache.spark.deploy.SparkHadoopUtil
 import org.apache.spark.SparkContext._
 import org.apache.spark.partial.{BoundedDouble, PartialResult}
 import org.apache.spark.Partitioner.defaultPartitioner
@@ -710,6 +711,7 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self: RDD[(K, V)])
     if (valueClass == null) {
       throw new SparkException("Output value class not set")
     }
+    SparkHadoopUtil.get.addCredentials(conf)
 
     logDebug("Saving as hadoop file of type (" + keyClass.getSimpleName + ", " +
       valueClass.getSimpleName + ")")
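
For context, a minimal sketch of the scenario this fix unblocks, assuming Spark 0.9
with the patch above and an application submitted to YARN; the object name and HDFS
path are hypothetical, added only for illustration:

import org.apache.spark.{SparkConf, SparkContext}

object SaveToHdfsSketch {
  def main(args: Array[String]): Unit = {
    // Master/deploy settings are assumed to be supplied by the YARN submission,
    // so only the application name is set here.
    val sc = new SparkContext(new SparkConf().setAppName("SaveToHdfsSketch"))
    val counts = sc.parallelize(Seq(("a", 1), ("b", 2)))
    // saveAsTextFile goes through saveAsHadoopFile/saveAsHadoopDataset, where the
    // patched addCredentials call copies the caller's Hadoop credentials (HDFS
    // delegation tokens) into the JobConf before the write starts.
    counts.saveAsTextFile("hdfs:///tmp/spark-1203-sketch")
    sc.stop()
  }
}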



git commit: [SPARK-1275] Made dev/run-tests executable.

2014-03-19 Thread tdas
Repository: spark
Updated Branches:
  refs/heads/branch-0.9 72875b29b -> a4eef655c


[SPARK-1275] Made dev/run-tests executable.

This was causing Jenkins tests to fail for PRs against branch 0.9.

Author: Tathagata Das tathagata.das1...@gmail.com

Closes #178 from tdas/branch-0.9-fix and squashes the following commits:

a633bce [Tathagata Das] Merge remote-tracking branch 'apache-github/branch-0.9' 
into branch-0.9-fix
9b043cc [Tathagata Das] Made dev/run-tests executable.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a4eef655
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a4eef655
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a4eef655

Branch: refs/heads/branch-0.9
Commit: a4eef655c7263f73db6f2c837288982140f17a23
Parents: 72875b2
Author: Tathagata Das tathagata.das1...@gmail.com
Authored: Wed Mar 19 16:10:45 2014 -0700
Committer: Tathagata Das tathagata.das1...@gmail.com
Committed: Wed Mar 19 16:10:45 2014 -0700

--
 dev/run-tests | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a4eef655/dev/run-tests
--
diff --git a/dev/run-tests b/dev/run-tests
old mode 100644
new mode 100755



git commit: SPARK-1099:Spark's local mode should probably respect spark.cores.max by default

2014-03-19 Thread adav
Repository: spark
Updated Branches:
  refs/heads/master 67fa71cba -> 16789317a


SPARK-1099:Spark's local mode should probably respect spark.cores.max by default

This is for JIRA: https://spark-project.atlassian.net/browse/SPARK-1099
This is what I do in this patch (also commented in the JIRA) @aarondav

This is really a behavioral change, so I do it with great caution and welcome any
review advice:

1. I changed how the MASTER=local pattern creates the LocalBackend. Previously we
passed it 1 core; now it uses a default core count. The reason is that when someone
uses spark-shell to start local mode, the REPL uses this MASTER=local pattern by
default, so cores specified on the spark-shell command line also end up here.
Hard-coding 1 core is therefore no longer suitable.
2. In the LocalBackend, the totalCores variable is now derived by a different rule
(previously it just took whatever core count the user passed in, e.g. 1 for the
MASTER=local pattern, 2 for MASTER=local[2]). The rules:
a. The second constructor argument of LocalBackend, the core count, defaults to
Int.MaxValue; if the user does not pass it, it stays at Int.MaxValue.
b. In getMaxCores, we first compare that value to Int.MaxValue; if it differs, the
user has passed a desired value, so we just use it.
c. If b is not satisfied, we read cores from spark.cores.max and the real logical
core count from Runtime. If spark.cores.max is bigger than the logical core count,
we use the logical cores; otherwise we use spark.cores.max. (A standalone sketch of
the resolution rule that was ultimately committed appears after this list.)
3. In SparkContextSchedulerCreationSuite's test("local") case, the assertion is
changed from 1 to the logical core count, because the MASTER=local pattern now uses
the default values.
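
Below is a minimal standalone sketch of the cores-resolution rule that the merged
change applies in SparkContext (see the diff further down); the object and method
names and the worked numbers are hypothetical, added only for illustration:

import org.apache.spark.SparkConf

object LocalCoresSketch {
  // Take spark.cores.max if set, but never exceed the machine's logical core
  // count; when the property is unset, default to all logical cores.
  def resolveLocalCores(conf: SparkConf): Int = {
    val realCores = Runtime.getRuntime.availableProcessors()
    math.min(conf.getInt("spark.cores.max", realCores), realCores)
  }

  def main(args: Array[String]): Unit = {
    // Worked example: on an 8-core machine, spark.cores.max=2 resolves to 2,
    // while spark.cores.max=100 (or leaving it unset) resolves to 8.
    val conf = new SparkConf().set("spark.cores.max", "2")
    println(resolveLocalCores(conf))
  }
}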

Author: qqsun8819 jin@alibaba-inc.com

Closes #110 from qqsun8819/local-cores and squashes the following commits:

731aefa [qqsun8819] 1 LocalBackend not change 2 In SparkContext do some process 
to the cores and pass it to original LocalBackend constructor
78b9c60 [qqsun8819] 1 SparkContext MASTER=local pattern use default cores 
instead of 1 to construct LocalBackEnd , for use of spark-shell and cores 
specified in cmd line 2 some test case change from local to local[1]. 3 
SparkContextSchedulerCreationSuite test spark.cores.max config in local pattern
6ae1ee8 [qqsun8819] Add a static function in LocalBackEnd to let it use 
spark.cores.max specified cores when no cores are passed to it


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/16789317
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/16789317
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/16789317

Branch: refs/heads/master
Commit: 16789317a34c1974f7b35960f06a7b51d8e0f29f
Parents: 67fa71c
Author: qqsun8819 jin@alibaba-inc.com
Authored: Wed Mar 19 16:33:54 2014 -0700
Committer: Aaron Davidson aa...@databricks.com
Committed: Wed Mar 19 16:33:54 2014 -0700

--
 .../scala/org/apache/spark/SparkContext.scala    |  5 ++++-
 .../test/scala/org/apache/spark/FileSuite.scala  |  4 ++--
 .../SparkContextSchedulerCreationSuite.scala     | 19 ++++++++++++++++---
 3 files changed, 22 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/16789317/core/src/main/scala/org/apache/spark/SparkContext.scala
--
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index a1003b7..8f74607 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -1262,7 +1262,10 @@ object SparkContext extends Logging {
     master match {
       case "local" =>
         val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
-        val backend = new LocalBackend(scheduler, 1)
+        // Use user specified in config, up to all available cores
+        val realCores = Runtime.getRuntime.availableProcessors()
+        val toUseCores = math.min(sc.conf.getInt("spark.cores.max", realCores), realCores)
+        val backend = new LocalBackend(scheduler, toUseCores)
         scheduler.initialize(backend)
         scheduler
 

http://git-wip-us.apache.org/repos/asf/spark/blob/16789317/core/src/test/scala/org/apache/spark/FileSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/FileSuite.scala b/core/src/test/scala/org/apache/spark/FileSuite.scala
index 01af940..b4a5881 100644
--- a/core/src/test/scala/org/apache/spark/FileSuite.scala
+++ b/core/src/test/scala/org/apache/spark/FileSuite.scala
@@ -34,7 +34,7 @@ import org.apache.spark.SparkContext._
 class FileSuite extends FunSuite with