git commit: [SPARK-3597][Mesos] Implement `killTask`.
Repository: spark Updated Branches: refs/heads/master cf1d32e3e - 32fad4233 [SPARK-3597][Mesos] Implement `killTask`. The MesosSchedulerBackend did not previously implement `killTask`, resulting in an exception. Author: Brenden Matthews bren...@diddyinc.com Closes #2453 from brndnmtthws/implement-killtask and squashes the following commits: 23ddcdc [Brenden Matthews] [SPARK-3597][Mesos] Implement `killTask`. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/32fad423 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/32fad423 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/32fad423 Branch: refs/heads/master Commit: 32fad4233f353814496c84e15ba64326730b7ae7 Parents: cf1d32e Author: Brenden Matthews bren...@diddyinc.com Authored: Sun Oct 5 09:49:24 2014 -0700 Committer: Andrew Or andrewo...@gmail.com Committed: Sun Oct 5 09:49:24 2014 -0700 -- .../spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala | 7 +++ 1 file changed, 7 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/32fad423/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala -- diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala index b117863..e0f2fd6 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala @@ -372,6 +372,13 @@ private[spark] class MesosSchedulerBackend( recordSlaveLost(d, slaveId, ExecutorExited(status)) } + override def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit = { +driver.killTask( + TaskID.newBuilder() +.setValue(taskId.toString).build() +) + } + // TODO: query Mesos for number of cores override def defaultParallelism() = sc.conf.getInt(spark.default.parallelism, 8) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
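The diff above lost its line breaks in transit; the substance of the change is a single new override that forwards Spark's kill request to the Mesos driver, converting the numeric Spark task ID into a Mesos `TaskID`. A minimal sketch of that call against the Mesos Java API (the standalone helper form is ours; in the patch it is an `override` on `MesosSchedulerBackend`):

```scala
import org.apache.mesos.Protos.TaskID
import org.apache.mesos.SchedulerDriver

// Reformatted from the diff above: forward Spark's kill request to Mesos.
// The executorId and interruptThread arguments of the SchedulerBackend
// contract are not needed here, since Mesos kills tasks by TaskID alone.
def killTask(driver: SchedulerDriver, taskId: Long): Unit = {
  driver.killTask(
    TaskID.newBuilder()
      .setValue(taskId.toString)
      .build())
}
```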
git commit: [SPARK-3597][Mesos] Implement `killTask`.
Repository: spark Updated Branches: refs/heads/branch-1.1 e4ddedee6 - d9cf4d08a [SPARK-3597][Mesos] Implement `killTask`. The MesosSchedulerBackend did not previously implement `killTask`, resulting in an exception. Author: Brenden Matthews bren...@diddyinc.com Closes #2453 from brndnmtthws/implement-killtask and squashes the following commits: 23ddcdc [Brenden Matthews] [SPARK-3597][Mesos] Implement `killTask`. (cherry picked from commit 32fad4233f353814496c84e15ba64326730b7ae7) Signed-off-by: Andrew Or andrewo...@gmail.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d9cf4d08 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d9cf4d08 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d9cf4d08 Branch: refs/heads/branch-1.1 Commit: d9cf4d08ae392cc840fac21ba153fdf9d9219782 Parents: e4ddede Author: Brenden Matthews bren...@diddyinc.com Authored: Sun Oct 5 09:49:24 2014 -0700 Committer: Andrew Or andrewo...@gmail.com Committed: Sun Oct 5 09:49:35 2014 -0700 -- .../spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala | 7 +++ 1 file changed, 7 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d9cf4d08/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala -- diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala index 06f2c09..8f064bf 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala @@ -369,6 +369,13 @@ private[spark] class MesosSchedulerBackend( recordSlaveLost(d, slaveId, ExecutorExited(status)) } + override def killTask(taskId: Long, executorId: String, interruptThread: Boolean): Unit = { +driver.killTask( + TaskID.newBuilder() +.setValue(taskId.toString).build() +) + } + // TODO: query Mesos for number of cores override def defaultParallelism() = sc.conf.getInt(spark.default.parallelism, 8) } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
git commit: SPARK-1656: Fix potential resource leaks
Repository: spark Updated Branches: refs/heads/master 32fad4233 - a7c73130f SPARK-1656: Fix potential resource leaks JIRA: https://issues.apache.org/jira/browse/SPARK-1656 Author: zsxwing zsxw...@gmail.com Closes #577 from zsxwing/SPARK-1656 and squashes the following commits: c431095 [zsxwing] Add a comment and fix the code style 2de96e5 [zsxwing] Make sure file will be deleted if exception happens 28b90dc [zsxwing] Update to follow the code style 4521d6e [zsxwing] Merge branch 'master' into SPARK-1656 afc3383 [zsxwing] Update to follow the code style 071fdd1 [zsxwing] SPARK-1656: Fix potential resource leaks Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a7c73130 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a7c73130 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a7c73130 Branch: refs/heads/master Commit: a7c73130f1b6b0b8b19a7b0a0de5c713b673cd7b Parents: 32fad42 Author: zsxwing zsxw...@gmail.com Authored: Sun Oct 5 09:55:17 2014 -0700 Committer: Andrew Or andrewo...@gmail.com Committed: Sun Oct 5 09:56:23 2014 -0700 -- .../apache/spark/broadcast/HttpBroadcast.scala | 25 .../master/FileSystemPersistenceEngine.scala| 14 +++ .../org/apache/spark/storage/DiskStore.scala| 16 - 3 files changed, 40 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/a7c73130/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala -- diff --git a/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala b/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala index 942dc7d..4cd4f4f 100644 --- a/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala +++ b/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala @@ -163,18 +163,23 @@ private[broadcast] object HttpBroadcast extends Logging { private def write(id: Long, value: Any) { val file = getFile(id) -val out: OutputStream = { - if (compress) { -compressionCodec.compressedOutputStream(new FileOutputStream(file)) - } else { -new BufferedOutputStream(new FileOutputStream(file), bufferSize) +val fileOutputStream = new FileOutputStream(file) +try { + val out: OutputStream = { +if (compress) { + compressionCodec.compressedOutputStream(fileOutputStream) +} else { + new BufferedOutputStream(fileOutputStream, bufferSize) +} } + val ser = SparkEnv.get.serializer.newInstance() + val serOut = ser.serializeStream(out) + serOut.writeObject(value) + serOut.close() + files += file +} finally { + fileOutputStream.close() } -val ser = SparkEnv.get.serializer.newInstance() -val serOut = ser.serializeStream(out) -serOut.writeObject(value) -serOut.close() -files += file } private def read[T: ClassTag](id: Long): T = { http://git-wip-us.apache.org/repos/asf/spark/blob/a7c73130/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala b/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala index aa85aa0..08a99bb 100644 --- a/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala +++ b/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala @@ -83,15 +83,21 @@ private[spark] class FileSystemPersistenceEngine( val serialized = serializer.toBinary(value) val out = new FileOutputStream(file) -out.write(serialized) -out.close() +try { + out.write(serialized) +} finally { + out.close() +} } def 
deserializeFromFile[T](file: File)(implicit m: Manifest[T]): T = { val fileData = new Array[Byte](file.length().asInstanceOf[Int]) val dis = new DataInputStream(new FileInputStream(file)) -dis.readFully(fileData) -dis.close() +try { + dis.readFully(fileData) +} finally { + dis.close() +} val clazz = m.runtimeClass.asInstanceOf[Class[T]] val serializer = serialization.serializerFor(clazz) http://git-wip-us.apache.org/repos/asf/spark/blob/a7c73130/core/src/main/scala/org/apache/spark/storage/DiskStore.scala -- diff --git a/core/src/main/scala/org/apache/spark/storage/DiskStore.scala b/core/src/main/scala/org/apache/spark/storage/DiskStore.scala index e9304f6..bac459e 100644 --- a/core/src/main/scala/org/apache/spark/storage/DiskStore.scala +++
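Every file touched by this patch gets the same treatment: the stream is opened, the work happens inside `try`, and the handle is closed in `finally` so an exception during serialization no longer leaks it. A minimal standalone sketch of the pattern (the `writeToFile` helper name is illustrative, not from the patch):

```scala
import java.io.{File, FileOutputStream}

// Minimal illustration of the fix: the stream is closed in `finally`,
// so an exception thrown while writing no longer leaks the file handle.
def writeToFile(file: File, bytes: Array[Byte]): Unit = {
  val out = new FileOutputStream(file)
  try {
    out.write(bytes)
  } finally {
    out.close()
  }
}
```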
git commit: SPARK-1656: Fix potential resource leaks
Repository: spark Updated Branches: refs/heads/branch-1.1 d9cf4d08a - c068d9084 SPARK-1656: Fix potential resource leaks JIRA: https://issues.apache.org/jira/browse/SPARK-1656 Author: zsxwing zsxw...@gmail.com Closes #577 from zsxwing/SPARK-1656 and squashes the following commits: c431095 [zsxwing] Add a comment and fix the code style 2de96e5 [zsxwing] Make sure file will be deleted if exception happens 28b90dc [zsxwing] Update to follow the code style 4521d6e [zsxwing] Merge branch 'master' into SPARK-1656 afc3383 [zsxwing] Update to follow the code style 071fdd1 [zsxwing] SPARK-1656: Fix potential resource leaks (cherry picked from commit a7c73130f1b6b0b8b19a7b0a0de5c713b673cd7b) Signed-off-by: Andrew Or andrewo...@gmail.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c068d908 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c068d908 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c068d908 Branch: refs/heads/branch-1.1 Commit: c068d9084c94cdd1aeee2c6ad6f55148bc4527ce Parents: d9cf4d0 Author: zsxwing zsxw...@gmail.com Authored: Sun Oct 5 09:55:17 2014 -0700 Committer: Andrew Or andrewo...@gmail.com Committed: Sun Oct 5 09:56:32 2014 -0700 -- .../apache/spark/broadcast/HttpBroadcast.scala | 25 .../master/FileSystemPersistenceEngine.scala| 14 +++ .../org/apache/spark/storage/DiskStore.scala| 16 - 3 files changed, 40 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/c068d908/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala -- diff --git a/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala b/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala index 942dc7d..4cd4f4f 100644 --- a/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala +++ b/core/src/main/scala/org/apache/spark/broadcast/HttpBroadcast.scala @@ -163,18 +163,23 @@ private[broadcast] object HttpBroadcast extends Logging { private def write(id: Long, value: Any) { val file = getFile(id) -val out: OutputStream = { - if (compress) { -compressionCodec.compressedOutputStream(new FileOutputStream(file)) - } else { -new BufferedOutputStream(new FileOutputStream(file), bufferSize) +val fileOutputStream = new FileOutputStream(file) +try { + val out: OutputStream = { +if (compress) { + compressionCodec.compressedOutputStream(fileOutputStream) +} else { + new BufferedOutputStream(fileOutputStream, bufferSize) +} } + val ser = SparkEnv.get.serializer.newInstance() + val serOut = ser.serializeStream(out) + serOut.writeObject(value) + serOut.close() + files += file +} finally { + fileOutputStream.close() } -val ser = SparkEnv.get.serializer.newInstance() -val serOut = ser.serializeStream(out) -serOut.writeObject(value) -serOut.close() -files += file } private def read[T: ClassTag](id: Long): T = { http://git-wip-us.apache.org/repos/asf/spark/blob/c068d908/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala b/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala index aa85aa0..08a99bb 100644 --- a/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala +++ b/core/src/main/scala/org/apache/spark/deploy/master/FileSystemPersistenceEngine.scala @@ -83,15 +83,21 @@ private[spark] class FileSystemPersistenceEngine( val serialized = serializer.toBinary(value) val out = new 
FileOutputStream(file) -out.write(serialized) -out.close() +try { + out.write(serialized) +} finally { + out.close() +} } def deserializeFromFile[T](file: File)(implicit m: Manifest[T]): T = { val fileData = new Array[Byte](file.length().asInstanceOf[Int]) val dis = new DataInputStream(new FileInputStream(file)) -dis.readFully(fileData) -dis.close() +try { + dis.readFully(fileData) +} finally { + dis.close() +} val clazz = m.runtimeClass.asInstanceOf[Class[T]] val serializer = serialization.serializerFor(clazz) http://git-wip-us.apache.org/repos/asf/spark/blob/c068d908/core/src/main/scala/org/apache/spark/storage/DiskStore.scala -- diff --git a/core/src/main/scala/org/apache/spark/storage/DiskStore.scala
git commit: [SPARK-3007][SQL] Fixes dynamic partitioning support for lower Hadoop versions
Repository: spark Updated Branches: refs/heads/master a7c73130f -> 1b97a941a

[SPARK-3007][SQL] Fixes dynamic partitioning support for lower Hadoop versions

This is a follow-up of #2226 and #2616 to fix Jenkins master SBT build failures for lower Hadoop versions (1.0.x and 2.0.x). The root cause is a semantic difference in `FileSystem.globStatus()` between Hadoop versions, as illustrated by the following test code:

```scala
object GlobExperiments extends App {
  val conf = new Configuration()
  val fs = FileSystem.getLocal(conf)

  fs.globStatus(new Path("/tmp/wh/*/*/*")).foreach { status =>
    println(status.getPath)
  }
}
```

Target directory structure:

```
/tmp/wh
├── dir0
│   ├── dir1
│   │   └── level2
│   └── level1
└── level0
```

Hadoop 2.4.1 result:

```
file:/tmp/wh/dir0/dir1/level2
```

Hadoop 1.0.4 result:

```
file:/tmp/wh/dir0/dir1/level2
file:/tmp/wh/dir0/level1
file:/tmp/wh/level0
```

In #2226 and #2616, we call `FileOutputCommitter.commitJob()` at the end of the job, and the `_SUCCESS` mark file is written. When working with lower Hadoop versions, due to the `globStatus()` semantics issue, `_SUCCESS` is picked up as a separate partition data file by `Hive.loadDynamicPartitions()` and fails partition spec checking. The fix introduced in this PR is admittedly a hack: when inserting data with dynamic partitioning, we intentionally avoid writing the `_SUCCESS` marker to work around this issue.

Hive doesn't suffer from this issue because `FileSinkOperator` doesn't call `FileOutputCommitter.commitJob()`; instead, it calls `Utilities.mvFileToFinalPath()` to clean up the output directory and then loads it into the Hive warehouse with `loadDynamicPartitions()`/`loadPartition()`/`loadTable()`. That approach is better because it handles failed jobs and speculative tasks properly. We should add this step to `InsertIntoHiveTable` in another PR.
Author: Cheng Lian lian.cs@gmail.com Closes #2663 from liancheng/dp-hadoop-1-fix and squashes the following commits: 0177dae [Cheng Lian] Fixes dynamic partitioning support for lower Hadoop versions Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1b97a941 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1b97a941 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1b97a941 Branch: refs/heads/master Commit: 1b97a941a09a2f63d442f435c1b444d857cd6956 Parents: a7c7313 Author: Cheng Lian lian.cs@gmail.com Authored: Sun Oct 5 11:19:17 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 11:19:17 2014 -0700 -- .../spark/sql/hive/hiveWriterContainers.scala | 26 +--- 1 file changed, 22 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/1b97a941/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala -- diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala index ac5c7a8..6ccbc22 100644 --- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala +++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala @@ -55,8 +55,8 @@ private[hive] class SparkHiveWriterContainer( private var taID: SerializableWritable[TaskAttemptID] = null @transient private var writer: FileSinkOperator.RecordWriter = null - @transient private lazy val committer = conf.value.getOutputCommitter - @transient private lazy val jobContext = newJobContext(conf.value, jID.value) + @transient protected lazy val committer = conf.value.getOutputCommitter + @transient protected lazy val jobContext = newJobContext(conf.value, jID.value) @transient private lazy val taskContext = newTaskAttemptContext(conf.value, taID.value) @transient private lazy val outputFormat = conf.value.getOutputFormat.asInstanceOf[HiveOutputFormat[AnyRef,Writable]] @@ -122,8 +122,6 @@ private[hive] class SparkHiveWriterContainer( } } - // * Private Functions * - private def setIDs(jobId: Int, splitId: Int, attemptId: Int) { jobID = jobId splitID = splitId @@ -157,12 +155,18 @@ private[hive] object SparkHiveWriterContainer { } } +private[spark] object SparkHiveDynamicPartitionWriterContainer { + val SUCCESSFUL_JOB_OUTPUT_DIR_MARKER = mapreduce.fileoutputcommitter.marksuccessfuljobs +} + private[spark] class SparkHiveDynamicPartitionWriterContainer( @transient jobConf: JobConf, fileSinkConf: FileSinkDesc, dynamicPartColNames: Array[String]) extends SparkHiveWriterContainer(jobConf, fileSinkConf) { + import
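The workaround itself is narrow: before committing, the dynamic-partition writer container switches off the `_SUCCESS` marker so that the greedy `globStatus()` of older Hadoop never sees it as partition data. A sketch of that idea, assuming the standard Hadoop `JobConf` API; the constant comes from the diff, while the helper method is ours:

```scala
import org.apache.hadoop.mapred.JobConf

// Assumed to mirror SparkHiveDynamicPartitionWriterContainer in the diff:
// disable the _SUCCESS marker so Hadoop 1.x/2.0.x globStatus() does not
// report it as a bogus partition entry to Hive.loadDynamicPartitions().
val SUCCESSFUL_JOB_OUTPUT_DIR_MARKER = "mapreduce.fileoutputcommitter.marksuccessfuljobs"

def disableSuccessMarker(jobConf: JobConf): Unit = {
  jobConf.setBoolean(SUCCESSFUL_JOB_OUTPUT_DIR_MARKER, false)
}
```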
git commit: HOTFIX: Fix unicode error in merge script.
Repository: spark Updated Branches: refs/heads/master 1b97a941a - e21e2 HOTFIX: Fix unicode error in merge script. The merge script builds up a big command array and sometimes this contains both unicode and ascii strings. This doesn't work if you try to join them into a single string. Longer term a solution is to go and make sure the source of all strings is unicode. This patch provides a simpler solution... just print the array rather than joining. I actually prefer printing an array here anyways since joining on spaces is lossy in the case of arguments that themselves contain spaces. Author: Patrick Wendell pwend...@gmail.com Closes #2645 from pwendell/merge-script and squashes the following commits: 167b792 [Patrick Wendell] HOTFIX: Fix unicode error in merge script. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e21e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e21e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e21e Branch: refs/heads/master Commit: e21e24c122300bbde6d5ec4002a7c42b2e24 Parents: 1b97a94 Author: Patrick Wendell pwend...@gmail.com Authored: Sun Oct 5 13:22:40 2014 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Sun Oct 5 13:22:40 2014 -0700 -- dev/merge_spark_pr.py | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/e21e/dev/merge_spark_pr.py -- diff --git a/dev/merge_spark_pr.py b/dev/merge_spark_pr.py index a8e92e3..02ac209 100755 --- a/dev/merge_spark_pr.py +++ b/dev/merge_spark_pr.py @@ -73,11 +73,10 @@ def fail(msg): def run_cmd(cmd): +print cmd if isinstance(cmd, list): -print .join(cmd) return subprocess.check_output(cmd) else: -print cmd return subprocess.check_output(cmd.split( )) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
git commit: [SPARK-3792][SQL] Enable JavaHiveQLSuite
Repository: spark Updated Branches: refs/heads/master 79b2108de - 58f5361ca [SPARK-3792][SQL] Enable JavaHiveQLSuite Do not use TestSQLContext in JavaHiveQLSuite, that may lead to two SparkContexts in one jvm and enable JavaHiveQLSuite Author: scwf wangf...@huawei.com Closes #2652 from scwf/fix-JavaHiveQLSuite and squashes the following commits: be35c91 [scwf] enable JavaHiveQLSuite Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/58f5361c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/58f5361c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/58f5361c Branch: refs/heads/master Commit: 58f5361caaa2f898e38ae4b3794167881e20a818 Parents: 79b2108 Author: scwf wangf...@huawei.com Authored: Sun Oct 5 17:47:20 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 17:49:41 2014 -0700 -- .../sql/hive/api/java/JavaHiveQLSuite.scala | 27 +++- 1 file changed, 9 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/58f5361c/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala -- diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala index 9644b70..46b11b5 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala @@ -25,34 +25,30 @@ import org.apache.spark.api.java.JavaSparkContext import org.apache.spark.sql.api.java.JavaSchemaRDD import org.apache.spark.sql.execution.ExplainCommand import org.apache.spark.sql.hive.test.TestHive -import org.apache.spark.sql.test.TestSQLContext // Implicits import scala.collection.JavaConversions._ class JavaHiveQLSuite extends FunSuite { - lazy val javaCtx = new JavaSparkContext(TestSQLContext.sparkContext) + lazy val javaCtx = new JavaSparkContext(TestHive.sparkContext) // There is a little trickery here to avoid instantiating two HiveContexts in the same JVM lazy val javaHiveCtx = new JavaHiveContext(javaCtx) { override val sqlContext = TestHive } - ignore(SELECT * FROM src) { + test(SELECT * FROM src) { assert( javaHiveCtx.sql(SELECT * FROM src).collect().map(_.getInt(0)) === TestHive.sql(SELECT * FROM src).collect().map(_.getInt(0)).toSeq) } - private val explainCommandClassName = -classOf[ExplainCommand].getSimpleName.stripSuffix($) - def isExplanation(result: JavaSchemaRDD) = { val explanation = result.collect().map(_.getString(0)) -explanation.size 1 explanation.head.startsWith(explainCommandClassName) +explanation.size 1 explanation.head.startsWith(== Physical Plan ==) } - ignore(Query Hive native command execution result) { + test(Query Hive native command execution result) { val tableName = test_native_commands assertResult(0) { @@ -63,23 +59,18 @@ class JavaHiveQLSuite extends FunSuite { javaHiveCtx.sql(sCREATE TABLE $tableName(key INT, value STRING)).count() } -javaHiveCtx.sql(SHOW TABLES).registerTempTable(show_tables) - assert( javaHiveCtx -.sql(SELECT result FROM show_tables) +.sql(SHOW TABLES) .collect() .map(_.getString(0)) .contains(tableName)) -assertResult(Array(Array(key, int, None), Array(value, string, None))) { - javaHiveCtx.sql(sDESCRIBE $tableName).registerTempTable(describe_table) - - +assertResult(Array(Array(key, int), Array(value, string))) { javaHiveCtx -.sql(SELECT result FROM describe_table) +.sql(sdescribe $tableName) 
.collect() -.map(_.getString(0).split(\t).map(_.trim)) +.map(row = Array(row.get(0).asInstanceOf[String], row.get(1).asInstanceOf[String])) .toArray } @@ -89,7 +80,7 @@ class JavaHiveQLSuite extends FunSuite { TestHive.reset() } - ignore(Exactly once semantics for DDL and command statements) { + test(Exactly once semantics for DDL and command statements) { val tableName = test_exactly_once val q0 = javaHiveCtx.sql(sCREATE TABLE $tableName(key INT, value STRING)) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
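The key trick, easy to miss in the flattened diff, is that the suite now reuses `TestHive`'s SparkContext instead of pulling in `TestSQLContext`, so only one SparkContext (and one HiveContext) ever exists in the test JVM. A condensed sketch of that setup, with the test bodies omitted:

```scala
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.sql.hive.api.java.JavaHiveContext
import org.apache.spark.sql.hive.test.TestHive

object JavaHiveQLSuiteSetup {
  // Reuse TestHive's SparkContext so no second SparkContext is created.
  lazy val javaCtx = new JavaSparkContext(TestHive.sparkContext)

  // Back the Java API wrapper with the already-running TestHive context,
  // avoiding a second HiveContext in the same JVM as well.
  lazy val javaHiveCtx = new JavaHiveContext(javaCtx) {
    override val sqlContext = TestHive
  }
}
```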
git commit: [SPARK-3792][SQL] Enable JavaHiveQLSuite
Repository: spark Updated Branches: refs/heads/branch-1.1 c068d9084 - 964e3aa48 [SPARK-3792][SQL] Enable JavaHiveQLSuite Do not use TestSQLContext in JavaHiveQLSuite, that may lead to two SparkContexts in one jvm and enable JavaHiveQLSuite Author: scwf wangf...@huawei.com Closes #2652 from scwf/fix-JavaHiveQLSuite and squashes the following commits: be35c91 [scwf] enable JavaHiveQLSuite (cherry picked from commit 58f5361caaa2f898e38ae4b3794167881e20a818) Signed-off-by: Michael Armbrust mich...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/964e3aa4 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/964e3aa4 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/964e3aa4 Branch: refs/heads/branch-1.1 Commit: 964e3aa4800a31037e00b533f965b0f162d29e67 Parents: c068d90 Author: scwf wangf...@huawei.com Authored: Sun Oct 5 17:47:20 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 17:50:11 2014 -0700 -- .../sql/hive/api/java/JavaHiveQLSuite.scala | 27 +++- 1 file changed, 9 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/964e3aa4/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala -- diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala index 9644b70..46b11b5 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/api/java/JavaHiveQLSuite.scala @@ -25,34 +25,30 @@ import org.apache.spark.api.java.JavaSparkContext import org.apache.spark.sql.api.java.JavaSchemaRDD import org.apache.spark.sql.execution.ExplainCommand import org.apache.spark.sql.hive.test.TestHive -import org.apache.spark.sql.test.TestSQLContext // Implicits import scala.collection.JavaConversions._ class JavaHiveQLSuite extends FunSuite { - lazy val javaCtx = new JavaSparkContext(TestSQLContext.sparkContext) + lazy val javaCtx = new JavaSparkContext(TestHive.sparkContext) // There is a little trickery here to avoid instantiating two HiveContexts in the same JVM lazy val javaHiveCtx = new JavaHiveContext(javaCtx) { override val sqlContext = TestHive } - ignore(SELECT * FROM src) { + test(SELECT * FROM src) { assert( javaHiveCtx.sql(SELECT * FROM src).collect().map(_.getInt(0)) === TestHive.sql(SELECT * FROM src).collect().map(_.getInt(0)).toSeq) } - private val explainCommandClassName = -classOf[ExplainCommand].getSimpleName.stripSuffix($) - def isExplanation(result: JavaSchemaRDD) = { val explanation = result.collect().map(_.getString(0)) -explanation.size 1 explanation.head.startsWith(explainCommandClassName) +explanation.size 1 explanation.head.startsWith(== Physical Plan ==) } - ignore(Query Hive native command execution result) { + test(Query Hive native command execution result) { val tableName = test_native_commands assertResult(0) { @@ -63,23 +59,18 @@ class JavaHiveQLSuite extends FunSuite { javaHiveCtx.sql(sCREATE TABLE $tableName(key INT, value STRING)).count() } -javaHiveCtx.sql(SHOW TABLES).registerTempTable(show_tables) - assert( javaHiveCtx -.sql(SELECT result FROM show_tables) +.sql(SHOW TABLES) .collect() .map(_.getString(0)) .contains(tableName)) -assertResult(Array(Array(key, int, None), Array(value, string, None))) { - javaHiveCtx.sql(sDESCRIBE $tableName).registerTempTable(describe_table) - - 
+assertResult(Array(Array(key, int), Array(value, string))) { javaHiveCtx -.sql(SELECT result FROM describe_table) +.sql(sdescribe $tableName) .collect() -.map(_.getString(0).split(\t).map(_.trim)) +.map(row = Array(row.get(0).asInstanceOf[String], row.get(1).asInstanceOf[String])) .toArray } @@ -89,7 +80,7 @@ class JavaHiveQLSuite extends FunSuite { TestHive.reset() } - ignore(Exactly once semantics for DDL and command statements) { + test(Exactly once semantics for DDL and command statements) { val tableName = test_exactly_once val q0 = javaHiveCtx.sql(sCREATE TABLE $tableName(key INT, value STRING)) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
git commit: [SPARK-3645][SQL] Makes table caching eager by default and adds syntax for lazy caching
Repository: spark Updated Branches: refs/heads/master 58f5361ca - 34b97a067 [SPARK-3645][SQL] Makes table caching eager by default and adds syntax for lazy caching Although lazy caching for in-memory table seems consistent with the `RDD.cache()` API, it's relatively confusing for users who mainly work with SQL and not familiar with Spark internals. The `CACHE TABLE t; SELECT COUNT(*) FROM t;` pattern is also commonly seen just to ensure predictable performance. This PR makes both the `CACHE TABLE t [AS SELECT ...]` statement and the `SQLContext.cacheTable()` API eager by default, and adds a new `CACHE LAZY TABLE t [AS SELECT ...]` syntax to provide lazy in-memory table caching. Also, took the chance to make some refactoring: `CacheCommand` and `CacheTableAsSelectCommand` are now merged and renamed to `CacheTableCommand` since the former is strictly a special case of the latter. A new `UncacheTableCommand` is added for the `UNCACHE TABLE t` statement. Author: Cheng Lian lian.cs@gmail.com Closes #2513 from liancheng/eager-caching and squashes the following commits: fe92287 [Cheng Lian] Makes table caching eager by default and adds syntax for lazy caching Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/34b97a06 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/34b97a06 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/34b97a06 Branch: refs/heads/master Commit: 34b97a067d1b370fbed8ecafab2f48501a35d783 Parents: 58f5361 Author: Cheng Lian lian.cs@gmail.com Authored: Sun Oct 5 17:51:59 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 17:51:59 2014 -0700 -- .../apache/spark/sql/catalyst/SqlParser.scala | 45 +++--- .../spark/sql/catalyst/analysis/Catalog.scala | 2 +- .../sql/catalyst/plans/logical/commands.scala | 15 +- .../org/apache/spark/sql/CacheManager.scala | 9 +- .../columnar/InMemoryColumnarTableScan.scala| 2 +- .../spark/sql/execution/SparkStrategies.scala | 8 +- .../apache/spark/sql/execution/commands.scala | 47 +++--- .../org/apache/spark/sql/CachedTableSuite.scala | 145 +-- .../spark/sql/hive/ExtendedHiveQlParser.scala | 66 - .../org/apache/spark/sql/hive/TestHive.scala| 6 +- .../spark/sql/hive/CachedTableSuite.scala | 78 +++--- 11 files changed, 265 insertions(+), 158 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/34b97a06/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala index 2633633..854b5b4 100755 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala @@ -67,11 +67,12 @@ class SqlParser extends StandardTokenParsers with PackratParsers { protected implicit def asParser(k: Keyword): Parser[String] = lexical.allCaseVersions(k.str).map(x = x : Parser[String]).reduce(_ | _) + protected val ABS = Keyword(ABS) protected val ALL = Keyword(ALL) protected val AND = Keyword(AND) + protected val APPROXIMATE = Keyword(APPROXIMATE) protected val AS = Keyword(AS) protected val ASC = Keyword(ASC) - protected val APPROXIMATE = Keyword(APPROXIMATE) protected val AVG = Keyword(AVG) protected val BETWEEN = Keyword(BETWEEN) protected val BY = Keyword(BY) @@ -80,9 +81,9 @@ class SqlParser extends StandardTokenParsers with PackratParsers { protected val COUNT = Keyword(COUNT) 
protected val DESC = Keyword(DESC) protected val DISTINCT = Keyword(DISTINCT) + protected val EXCEPT = Keyword(EXCEPT) protected val FALSE = Keyword(FALSE) protected val FIRST = Keyword(FIRST) - protected val LAST = Keyword(LAST) protected val FROM = Keyword(FROM) protected val FULL = Keyword(FULL) protected val GROUP = Keyword(GROUP) @@ -91,42 +92,42 @@ class SqlParser extends StandardTokenParsers with PackratParsers { protected val IN = Keyword(IN) protected val INNER = Keyword(INNER) protected val INSERT = Keyword(INSERT) + protected val INTERSECT = Keyword(INTERSECT) protected val INTO = Keyword(INTO) protected val IS = Keyword(IS) protected val JOIN = Keyword(JOIN) + protected val LAST = Keyword(LAST) + protected val LAZY = Keyword(LAZY) protected val LEFT = Keyword(LEFT) + protected val LIKE = Keyword(LIKE) protected val LIMIT = Keyword(LIMIT) + protected val LOWER = Keyword(LOWER) protected val MAX = Keyword(MAX) protected val MIN = Keyword(MIN) protected val NOT = Keyword(NOT) protected val NULL =
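In user-facing terms: `CACHE TABLE` (and `SQLContext.cacheTable()`) now materializes the table immediately, and the new `LAZY` keyword restores the old defer-until-first-use behaviour. A small usage sketch under those semantics; the table names and the surrounding helper are made up for illustration:

```scala
import org.apache.spark.sql.hive.HiveContext

// `hiveCtx` and the `logs` table are assumed to already exist.
def demoCaching(hiveCtx: HiveContext): Unit = {
  // Eager by default after this patch: the scan and the in-memory
  // columnar build happen as part of this statement.
  hiveCtx.sql("CACHE TABLE logs")

  // New LAZY variant: behaves like the old default and defers
  // materialization to the first query that touches the cached table.
  hiveCtx.sql("CACHE LAZY TABLE recent_logs AS SELECT * FROM logs WHERE ds = '2014-10-05'")

  hiveCtx.sql("UNCACHE TABLE logs")

  // The programmatic API is eager as well now.
  hiveCtx.cacheTable("logs")
}
```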
git commit: [SPARK-3776][SQL] Wrong conversion to Catalyst for Option[Product]
Repository: spark Updated Branches: refs/heads/master 34b97a067 - 90897ea5f [SPARK-3776][SQL] Wrong conversion to Catalyst for Option[Product] Author: Renat Yusupov re.yusu...@2gis.ru Closes #2641 from r3natko/feature/catalyst_option and squashes the following commits: 55d0c06 [Renat Yusupov] [SQL] SPARK-3776: Wrong conversion to Catalyst for Option[Product] Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/90897ea5 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/90897ea5 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/90897ea5 Branch: refs/heads/master Commit: 90897ea5f24b03c9f3455a62c7f68b3d3f0435ad Parents: 34b97a0 Author: Renat Yusupov re.yusu...@2gis.ru Authored: Sun Oct 5 17:56:24 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 17:56:34 2014 -0700 -- .../spark/sql/catalyst/ScalaReflection.scala| 2 +- .../sql/catalyst/ScalaReflectionSuite.scala | 21 +--- 2 files changed, 19 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/90897ea5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala index 88a8fa7..b3ae8e6 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala @@ -33,7 +33,7 @@ object ScalaReflection { /** Converts Scala objects to catalyst rows / types */ def convertToCatalyst(a: Any): Any = a match { -case o: Option[_] = o.orNull +case o: Option[_] = o.map(convertToCatalyst).orNull case s: Seq[_] = s.map(convertToCatalyst) case m: Map[_, _] = m.map { case (k, v) = convertToCatalyst(k) - convertToCatalyst(v) } case p: Product = new GenericRow(p.productIterator.map(convertToCatalyst).toArray) http://git-wip-us.apache.org/repos/asf/spark/blob/90897ea5/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala -- diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala index 428607d..488e373 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala @@ -53,7 +53,8 @@ case class OptionalData( floatField: Option[Float], shortField: Option[Short], byteField: Option[Byte], -booleanField: Option[Boolean]) +booleanField: Option[Boolean], +structField: Option[PrimitiveData]) case class ComplexData( arrayField: Seq[Int], @@ -100,7 +101,7 @@ class ScalaReflectionSuite extends FunSuite { nullable = true)) } - test(optinal data) { + test(optional data) { val schema = schemaFor[OptionalData] assert(schema === Schema( StructType(Seq( @@ -110,7 +111,8 @@ class ScalaReflectionSuite extends FunSuite { StructField(floatField, FloatType, nullable = true), StructField(shortField, ShortType, nullable = true), StructField(byteField, ByteType, nullable = true), -StructField(booleanField, BooleanType, nullable = true))), +StructField(booleanField, BooleanType, nullable = true), +StructField(structField, schemaFor[PrimitiveData].dataType, nullable = true))), nullable = true)) } @@ -228,4 +230,17 @@ class ScalaReflectionSuite extends FunSuite { 
assert(ArrayType(IntegerType) === typeOfObject3(Seq(1, 2, 3))) assert(ArrayType(ArrayType(IntegerType)) === typeOfObject3(Seq(Seq(1,2,3 } + + test(convert PrimitiveData to catalyst) { +val data = PrimitiveData(1, 1, 1, 1, 1, 1, true) +val convertedData = Seq(1, 1.toLong, 1.toDouble, 1.toFloat, 1.toShort, 1.toByte, true) +assert(convertToCatalyst(data) === convertedData) + } + + test(convert Option[Product] to catalyst) { +val primitiveData = PrimitiveData(1, 1, 1, 1, 1, 1, true) +val data = OptionalData(Some(1), Some(1), Some(1), Some(1), Some(1), Some(1), Some(true), Some(primitiveData)) +val convertedData = Seq(1, 1.toLong, 1.toDouble, 1.toFloat, 1.toShort, 1.toByte, true, convertToCatalyst(primitiveData)) +assert(convertToCatalyst(data) === convertedData) + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For
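The fix is a one-liner: previously an `Option` was merely unwrapped, so a case class inside `Some(...)` reached Catalyst unconverted. A reduced sketch of the corrected conversion, keeping only the cases relevant here (the final fall-through stands in for the remaining cases of the real method):

```scala
import org.apache.spark.sql.catalyst.expressions.GenericRow

// Reduced from ScalaReflection.convertToCatalyst: the fix recurses into the
// Option's payload instead of returning it as-is.
def convertToCatalyst(a: Any): Any = a match {
  case o: Option[_] => o.map(convertToCatalyst).orNull   // was: o.orNull
  case s: Seq[_]    => s.map(convertToCatalyst)
  case m: Map[_, _] => m.map { case (k, v) => convertToCatalyst(k) -> convertToCatalyst(v) }
  case p: Product   => new GenericRow(p.productIterator.map(convertToCatalyst).toArray)
  case other        => other
}
```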
git commit: SPARK-3794 [CORE] Building spark core fails due to inadvertent dependency on Commons IO
Repository: spark Updated Branches: refs/heads/master 90897ea5f - 8d22dbb5e SPARK-3794 [CORE] Building spark core fails due to inadvertent dependency on Commons IO Remove references to Commons IO FileUtils and replace with pure Java version, which doesn't need to traverse the whole directory tree first. I think this method could be refined further if it would be alright to rename it and its args and break it down into two methods. I'm starting with a simple recursive rendition. Author: Sean Owen so...@cloudera.com Closes #2662 from srowen/SPARK-3794 and squashes the following commits: 4cd172f [Sean Owen] Remove references to Commons IO FileUtils and replace with pure Java version, which doesn't need to traverse the whole directory tree first Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8d22dbb5 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8d22dbb5 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8d22dbb5 Branch: refs/heads/master Commit: 8d22dbb5ec7a0727afdfebbbc2c57ffdb384dd0b Parents: 90897ea Author: Sean Owen so...@cloudera.com Authored: Sun Oct 5 18:44:12 2014 -0700 Committer: Michael Armbrust mich...@databricks.com Committed: Sun Oct 5 18:44:12 2014 -0700 -- .../org/apache/spark/deploy/worker/Worker.scala | 1 - .../scala/org/apache/spark/util/Utils.scala | 20 ++-- 2 files changed, 10 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/8d22dbb5/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala index 3b13f43..9b52cb0 100755 --- a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala +++ b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala @@ -29,7 +29,6 @@ import scala.language.postfixOps import akka.actor._ import akka.remote.{DisassociatedEvent, RemotingLifecycleEvent} -import org.apache.commons.io.FileUtils import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkException} import org.apache.spark.deploy.{ExecutorDescription, ExecutorState} http://git-wip-us.apache.org/repos/asf/spark/blob/8d22dbb5/core/src/main/scala/org/apache/spark/util/Utils.scala -- diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala index a671241..3d307b3 100644 --- a/core/src/main/scala/org/apache/spark/util/Utils.scala +++ b/core/src/main/scala/org/apache/spark/util/Utils.scala @@ -35,8 +35,6 @@ import scala.util.control.{ControlThrowable, NonFatal} import com.google.common.io.Files import com.google.common.util.concurrent.ThreadFactoryBuilder -import org.apache.commons.io.FileUtils -import org.apache.commons.io.filefilter.TrueFileFilter import org.apache.commons.lang3.SystemUtils import org.apache.hadoop.conf.Configuration import org.apache.log4j.PropertyConfigurator @@ -710,18 +708,20 @@ private[spark] object Utils extends Logging { * Determines if a directory contains any files newer than cutoff seconds. * * @param dir must be the path to a directory, or IllegalArgumentException is thrown - * @param cutoff measured in seconds. Returns true if there are any files in dir newer than this. + * @param cutoff measured in seconds. 
Returns true if there are any files or directories in the + * given directory whose last modified time is later than this many seconds ago */ def doesDirectoryContainAnyNewFiles(dir: File, cutoff: Long): Boolean = { -val currentTimeMillis = System.currentTimeMillis if (!dir.isDirectory) { - throw new IllegalArgumentException (dir + is not a directory!) -} else { - val files = FileUtils.listFilesAndDirs(dir, TrueFileFilter.TRUE, TrueFileFilter.TRUE) - val cutoffTimeInMillis = (currentTimeMillis - (cutoff * 1000)) - val newFiles = files.filter { _.lastModified cutoffTimeInMillis } - newFiles.nonEmpty + throw new IllegalArgumentException($dir is not a directory!) } +val filesAndDirs = dir.listFiles() +val cutoffTimeInMillis = System.currentTimeMillis - (cutoff * 1000) + +filesAndDirs.exists(_.lastModified() cutoffTimeInMillis) || +filesAndDirs.filter(_.isDirectory).exists( + subdir = doesDirectoryContainAnyNewFiles(subdir, cutoff) +) } /** - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
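Reconstructed from the whitespace-mangled hunk above, the replacement is a plain recursive walk over `java.io.File` that short-circuits on the first entry newer than the cutoff, instead of materializing the whole directory tree the way the Commons IO call did:

```scala
import java.io.File

// Reconstructed from the diff: true if `dir` contains any file or directory
// modified within the last `cutoff` seconds. Recurses into subdirectories,
// but stops at the first hit.
def doesDirectoryContainAnyNewFiles(dir: File, cutoff: Long): Boolean = {
  if (!dir.isDirectory) {
    throw new IllegalArgumentException(s"$dir is not a directory!")
  }
  val filesAndDirs = dir.listFiles()
  val cutoffTimeInMillis = System.currentTimeMillis - (cutoff * 1000)

  filesAndDirs.exists(_.lastModified() > cutoffTimeInMillis) ||
    filesAndDirs.filter(_.isDirectory)
      .exists(subdir => doesDirectoryContainAnyNewFiles(subdir, cutoff))
}
```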
git commit: Rectify gereneric parameter names between SparkContext and AccumulablePa...
Repository: spark Updated Branches: refs/heads/master 8d22dbb5e - fd7b15539 Rectify gereneric parameter names between SparkContext and AccumulablePa... AccumulableParam gave its generic parameters as 'R, T', whereas SparkContext labeled them 'T, R'. Trivial, but really confusing. I resolved this in favor of AccumulableParam, because it seemed to have some logic for its names. I also extended this minimal, but at least present, justification into the SparkContext comments. Author: Nathan Kronenfeld nkronenf...@oculusinfo.com Closes #2637 from nkronenfeld/accumulators and squashes the following commits: 98d6b74 [Nathan Kronenfeld] Rectify gereneric parameter names between SparkContext and AccumulableParam Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fd7b1553 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fd7b1553 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fd7b1553 Branch: refs/heads/master Commit: fd7b15539669b14996a51610d6724ca0811f9d65 Parents: 8d22dbb Author: Nathan Kronenfeld nkronenf...@oculusinfo.com Authored: Sun Oct 5 21:03:48 2014 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Sun Oct 5 21:03:48 2014 -0700 -- core/src/main/scala/org/apache/spark/SparkContext.scala | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/fd7b1553/core/src/main/scala/org/apache/spark/SparkContext.scala -- diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala index 97109b9..396cdd1 100644 --- a/core/src/main/scala/org/apache/spark/SparkContext.scala +++ b/core/src/main/scala/org/apache/spark/SparkContext.scala @@ -779,20 +779,20 @@ class SparkContext(config: SparkConf) extends Logging { /** * Create an [[org.apache.spark.Accumulable]] shared variable, to which tasks can add values * with `+=`. Only the driver can access the accumuable's `value`. - * @tparam T accumulator type - * @tparam R type that can be added to the accumulator + * @tparam R accumulator result type + * @tparam T type that can be added to the accumulator */ - def accumulable[T, R](initialValue: T)(implicit param: AccumulableParam[T, R]) = + def accumulable[R, T](initialValue: R)(implicit param: AccumulableParam[R, T]) = new Accumulable(initialValue, param) /** * Create an [[org.apache.spark.Accumulable]] shared variable, with a name for display in the * Spark UI. Tasks can add values to the accumuable using the `+=` operator. Only the driver can * access the accumuable's `value`. - * @tparam T accumulator type - * @tparam R type that can be added to the accumulator + * @tparam R accumulator result type + * @tparam T type that can be added to the accumulator */ - def accumulable[T, R](initialValue: T, name: String)(implicit param: AccumulableParam[T, R]) = + def accumulable[R, T](initialValue: R, name: String)(implicit param: AccumulableParam[R, T]) = new Accumulable(initialValue, param, Some(name)) /** - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
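After the rename, both APIs read the same way: `R` is the accumulator's result type and `T` is the type of the values tasks add to it, matching `AccumulableParam[R, T]`. A small usage sketch under that convention; the set-union accumulator below is our own example, not part of the patch:

```scala
import org.apache.spark.{AccumulableParam, SparkContext}

// R = Set[String] (the accumulated result), T = String (what each task adds).
object StringSetParam extends AccumulableParam[Set[String], String] {
  def addAccumulator(r: Set[String], t: String): Set[String] = r + t
  def addInPlace(r1: Set[String], r2: Set[String]): Set[String] = r1 ++ r2
  def zero(initialValue: Set[String]): Set[String] = Set.empty
}

// Collect the first token of every line into a shared set of host names.
def collectHosts(sc: SparkContext, lines: Seq[String]): Set[String] = {
  val hosts = sc.accumulable(Set.empty[String])(StringSetParam)
  sc.parallelize(lines).foreach { line => hosts += line.split(" ")(0) }
  hosts.value // only the driver may read the accumulable's value
}
```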
git commit: [SPARK-3765][Doc] Add test information to sbt build docs
Repository: spark Updated Branches: refs/heads/master fd7b15539 - c9ae79fba [SPARK-3765][Doc] Add test information to sbt build docs Add testing with sbt to doc ```building-spark.md``` Author: scwf wangf...@huawei.com Closes #2629 from scwf/sbt-doc and squashes the following commits: fd9cf29 [scwf] add testing with sbt to docs Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c9ae79fb Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c9ae79fb Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c9ae79fb Branch: refs/heads/master Commit: c9ae79fba25cd49ca70ca398bc75434202d26a97 Parents: fd7b155 Author: scwf wangf...@huawei.com Authored: Sun Oct 5 21:36:20 2014 -0700 Committer: Patrick Wendell pwend...@gmail.com Committed: Sun Oct 5 21:36:28 2014 -0700 -- docs/building-spark.md | 15 +++ 1 file changed, 15 insertions(+) -- http://git-wip-us.apache.org/repos/asf/spark/blob/c9ae79fb/docs/building-spark.md -- diff --git a/docs/building-spark.md b/docs/building-spark.md index 901c157..b2940ee 100644 --- a/docs/building-spark.md +++ b/docs/building-spark.md @@ -171,6 +171,21 @@ can be set to control the SBT build. For example: sbt/sbt -Pyarn -Phadoop-2.3 assembly +# Testing with SBT + +Some of the tests require Spark to be packaged first, so always run `sbt/sbt assembly` the first time. The following is an example of a correct (build, test) sequence: + +sbt/sbt -Pyarn -Phadoop-2.3 -Phive assembly +sbt/sbt -Pyarn -Phadoop-2.3 -Phive test + +To run only a specific test suite as follows: + +sbt/sbt -Pyarn -Phadoop-2.3 -Phive test-only org.apache.spark.repl.ReplSuite + +To run test suites of a specific sub project as follows: + +sbt/sbt -Pyarn -Phadoop-2.3 -Phive core/test + # Speeding up Compilation with Zinc [Zinc](https://github.com/typesafehub/zinc) is a long-running server version of SBT's incremental - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org