spark git commit: [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem

2014-12-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 4cef05e1c -> 815de5400 [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of t

spark git commit: Added LICENSE Header to build/mvn, build/sbt and sbt/sbt

2014-12-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 6645e5258 -> 4cef05e1c Added LICENSE Header to build/mvn, build/sbt and sbt/sbt Recently, build/mvn and build/sbt are added, and sbt/sbt is changed but there are no license headers. Should we add license headers to the scripts right? If it'

spark git commit: [SPARK-4982][DOC] `spark.ui.retainedJobs` description is wrong in Spark UI configuration guide

2014-12-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 14fa87bdf -> 6645e5258 [SPARK-4982][DOC] `spark.ui.retainedJobs` description is wrong in Spark UI configuration guide Author: wangxiaojing Closes #3818 from wangxiaojing/SPARK-4982 and squashes the following commits: fe2ad5f [wangxiaoji

spark git commit: [SPARK-4982][DOC] `spark.ui.retainedJobs` description is wrong in Spark UI configuration guide

2014-12-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 2cd446a90 -> 76046664d [SPARK-4982][DOC] `spark.ui.retainedJobs` description is wrong in Spark UI configuration guide Author: wangxiaojing Closes #3818 from wangxiaojing/SPARK-4982 and squashes the following commits: fe2ad5f [wangxi

spark git commit: SPARK-4971: Fix typo in BlockGenerator comment

2014-12-26 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 acf5c6328 -> 391080b68 SPARK-4971: Fix typo in BlockGenerator comment Author: CodingCat Closes #3807 from CodingCat/new_branch and squashes the following commits: 5167f01 [CodingCat] fix typo in the comment (cherry picked from commi

spark git commit: SPARK-4971: Fix typo in BlockGenerator comment

2014-12-26 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f9ed2b664 -> fda4331d5 SPARK-4971: Fix typo in BlockGenerator comment Author: CodingCat Closes #3807 from CodingCat/new_branch and squashes the following commits: 5167f01 [CodingCat] fix typo in the comment Project: http://git-wip-us.a

spark git commit: [EC2] Update mesos/spark-ec2 branch to branch-1.3

2014-12-25 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master b6b6393b4 -> ac8278593 [EC2] Update mesos/spark-ec2 branch to branch-1.3 Going forward, we'll use matching branch names across the mesos/spark-ec2 and apache/spark repositories, per [the discussion here](https://github.com/mesos/spark-ec2

spark git commit: [EC2] Update default Spark version to 1.2.0

2014-12-25 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 08b18c7eb -> b6b6393b4 [EC2] Update default Spark version to 1.2.0 Now that 1.2.0 is out, let's update the default Spark version. Author: Nicholas Chammas Closes #3793 from nchammas/patch-1 and squashes the following commits: 3255832 [N

spark git commit: Fix "Building Spark With Maven" link in README.md

2014-12-25 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 17d6f547b -> 475ab6ec7 Fix "Building Spark With Maven" link in README.md Corrected link to the Building Spark with Maven page from its original (http://spark.apache.org/docs/latest/building-with-maven.html) to the current page (http:/

spark git commit: Fix "Building Spark With Maven" link in README.md

2014-12-25 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 11dd99317 -> 08b18c7eb Fix "Building Spark With Maven" link in README.md Corrected link to the Building Spark with Maven page from its original (http://spark.apache.org/docs/latest/building-with-maven.html) to the current page (http://spa

spark git commit: SPARK-4297 [BUILD] Build warning fixes omnibus

2014-12-24 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 199e59aac -> 29fabb1b5 SPARK-4297 [BUILD] Build warning fixes omnibus There are a number of warnings generated in a normal, successful build right now. They're mostly Java unchecked cast warnings, which can be suppressed. But there's a gr

spark git commit: [SPARK-4881][Minor] Use SparkConf#getBoolean instead of get().toBoolean

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master fd41eb957 -> 199e59aac [SPARK-4881][Minor] Use SparkConf#getBoolean instead of get().toBoolean It's really a minor issue. In ApplicationMaster, there is code like as follows. val preserveFiles = sparkConf.get("spark.yarn.preserve.stag
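The SPARK-4881 change replaces calls like `sparkConf.get("spark.yarn.preserve.staging.files").toBoolean` with `SparkConf#getBoolean`, which parses the string and supplies a default in one place. A rough Python sketch of the difference, using a hypothetical `Conf` class (not Spark's actual code):

```python
class Conf:
    """Minimal stand-in for a SparkConf-like settings map (illustrative only)."""
    def __init__(self, settings):
        self._settings = dict(settings)

    def get(self, key):
        # Raises KeyError when the key is absent, so callers must remember
        # to handle defaults and parse the string themselves.
        return self._settings[key]

    def get_boolean(self, key, default=False):
        # Centralizes string-to-boolean parsing and the default value,
        # mirroring the role of SparkConf#getBoolean.
        value = self._settings.get(key)
        return default if value is None else value.lower() == "true"

conf = Conf({"spark.yarn.preserve.staging.files": "true"})
# Fragile style: manual parsing, no default if the key is missing.
preserve = conf.get("spark.yarn.preserve.staging.files") == "true"
assert preserve is True
# Preferred style: one call, default included.
assert conf.get_boolean("spark.yarn.preserve.staging.files") is True
assert conf.get_boolean("spark.shuffle.compress", default=True) is True
```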

spark git commit: [SPARK-4860][pyspark][sql] speeding up `sample()` and `takeSample()`

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 7e2deb71c -> fd41eb957 [SPARK-4860][pyspark][sql] speeding up `sample()` and `takeSample()` This PR modifies the python `SchemaRDD` to use `sample()` and `takeSample()` from Scala instead of the slower python implementations from `rdd.py`.

spark git commit: [SPARK-4606] Send EOF to child JVM when there's no more data to read.

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 aa78c23ac -> 1a4e2ba73 [SPARK-4606] Send EOF to child JVM when there's no more data to read. Author: Marcelo Vanzin Closes #3460 from vanzin/SPARK-4606 and squashes the following commits: 031207d [Marcelo Vanzin] [SPARK-4606] Send EO

spark git commit: [SPARK-4606] Send EOF to child JVM when there's no more data to read.

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 b1de461a7 -> dd0287cca [SPARK-4606] Send EOF to child JVM when there's no more data to read. Author: Marcelo Vanzin Closes #3460 from vanzin/SPARK-4606 and squashes the following commits: 031207d [Marcelo Vanzin] [SPARK-4606] Send EO

spark git commit: [SPARK-4606] Send EOF to child JVM when there's no more data to read.

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 3f5f4cc4e -> 7e2deb71c [SPARK-4606] Send EOF to child JVM when there's no more data to read. Author: Marcelo Vanzin Closes #3460 from vanzin/SPARK-4606 and squashes the following commits: 031207d [Marcelo Vanzin] [SPARK-4606] Send EOF to

spark git commit: [SPARK-4913] Fix incorrect event log path

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 27c5399f4 -> 96281cd0c [SPARK-4913] Fix incorrect event log path SPARK-2261 uses a single file to log events for an app. `eventLogDir` in `ApplicationDescription` is replaced with `eventLogFile`. However, `ApplicationDescription` in `Spar

spark git commit: [SPARK-4730][YARN] Warn against deprecated YARN settings

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 7b5ba85df -> 6a46cc3c8 [SPARK-4730][YARN] Warn against deprecated YARN settings See https://issues.apache.org/jira/browse/SPARK-4730. Author: Andrew Or Closes #3590 from andrewor14/yarn-settings and squashes the following commits: 3

spark git commit: [SPARK-4730][YARN] Warn against deprecated YARN settings

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 395b771fe -> 27c5399f4 [SPARK-4730][YARN] Warn against deprecated YARN settings See https://issues.apache.org/jira/browse/SPARK-4730. Author: Andrew Or Closes #3590 from andrewor14/yarn-settings and squashes the following commits: 36e07

spark git commit: [SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 e74ce14e8 -> 7b5ba85df [SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1 This PR tries to fix the Hive tests failure encountered in PR #3157 by cleaning `lib_managed` before building assembly jar against Hive 0.1

spark git commit: [SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 9c251c555 -> 395b771fe [SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1 This PR tries to fix the Hive tests failure encountered in PR #3157 by cleaning `lib_managed` before building assembly jar against Hive 0.13.1

spark git commit: [SPARK-4932] Add help comments in Analytics

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 ec11ffddb -> e74ce14e8 [SPARK-4932] Add help comments in Analytics Trivial modifications for usability. Author: Takeshi Yamamuro Closes #3775 from maropu/AddHelpCommentInAnalytics and squashes the following commits: fbea8f5 [Takesh

spark git commit: [SPARK-4932] Add help comments in Analytics

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master dd155369a -> 9c251c555 [SPARK-4932] Add help comments in Analytics Trivial modifications for usability. Author: Takeshi Yamamuro Closes #3775 from maropu/AddHelpCommentInAnalytics and squashes the following commits: fbea8f5 [Takeshi Ya

spark git commit: [SPARK-4834] [standalone] Clean up application files after app finishes.

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 9fb86b80a -> ec11ffddb [SPARK-4834] [standalone] Clean up application files after app finishes. Commit 7aacb7bfa added support for sharing downloaded files among multiple executors of the same app. That works great in Yarn, since the ap

spark git commit: [SPARK-4834] [standalone] Clean up application files after app finishes.

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2d215aeba -> dd155369a [SPARK-4834] [standalone] Clean up application files after app finishes. Commit 7aacb7bfa added support for sharing downloaded files among multiple executors of the same app. That works great in Yarn, since the app's

spark git commit: [SPARK-4931][Yarn][Docs] Fix the format of running-on-yarn.md

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2823c7f02 -> 2d215aeba [SPARK-4931][Yarn][Docs] Fix the format of running-on-yarn.md Currently, the format about log4j in running-on-yarn.md is a bit messy. ![running-on-yarn](https://cloud.githubusercontent.com/assets/1000778/5535248/204c

spark git commit: [SPARK-4931][Yarn][Docs] Fix the format of running-on-yarn.md

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 f86fe0897 -> 9fb86b80a [SPARK-4931][Yarn][Docs] Fix the format of running-on-yarn.md Currently, the format about log4j in running-on-yarn.md is a bit messy. ![running-on-yarn](https://cloud.githubusercontent.com/assets/1000778/5535248/

spark git commit: [SPARK-4890] Ignore downloaded EC2 libs

2014-12-23 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 0e532ccb2 -> 2823c7f02 [SPARK-4890] Ignore downloaded EC2 libs PR #3737 changed `spark-ec2` to automatically download boto from PyPI. This PR tell git to ignore those downloaded library files. Author: Nicholas Chammas Closes #3770 from

spark git commit: [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join

2014-12-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 e5f27521d -> 3bce43f67 [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join In Scala, `map` and `flatMap` of `Iterable` will copy the contents of `Iterable` to a new `Seq`. Such as, ```Scala val iterable = Seq(1, 2, 3)

spark git commit: [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join

2014-12-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 a8a8e0e87 -> 58e37028a [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join In Scala, `map` and `flatMap` of `Iterable` will copy the contents of `Iterable` to a new `Seq`. Such as, ```Scala val iterable = Seq(1, 2, 3)

spark git commit: [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join

2014-12-22 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master de9d7d2b5 -> c233ab3d8 [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join In Scala, `map` and `flatMap` of `Iterable` will copy the contents of `Iterable` to a new `Seq`. Such as, ```Scala val iterable = Seq(1, 2, 3).map
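The point of SPARK-4818 is that Scala's `Iterable.map`/`flatMap` materialize a new `Seq`, while `iterator.map` stays lazy and never holds a full intermediate copy. A rough Python analogue of the same distinction (list comprehension vs generator; this is an illustration, not Spark's code):

```python
data = [1, 2, 3]

# Eager: like Seq.map in Scala, this builds the entire result in memory.
eager = [x + 1 for x in data]
assert eager == [2, 3, 4]

# Lazy: like data.iterator.map, nothing is computed until consumption.
lazy = (x + 1 for x in data)
assert not isinstance(lazy, list)        # no materialized copy yet
assert list(lazy) == [2, 3, 4]           # elements produced one at a time

# Chaining stays lazy too, so a join-like pipeline never holds an
# intermediate copy of the whole collection between the two steps.
chained = (x * 2 for x in (x + 1 for x in data))
assert list(chained) == [4, 6, 8]
```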

spark git commit: SPARK-4910 [CORE] build failed (use of FileStatus.isFile in Hadoop 1.x)

2014-12-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master a764960b3 -> c6a3c0d50 SPARK-4910 [CORE] build failed (use of FileStatus.isFile in Hadoop 1.x) Fix small Hadoop 1 compile error from SPARK-2261. In Hadoop 1.x, all we have is FileStatus.isDir, so these "is file" assertions are changed to "

spark git commit: [Minor] Build Failed: value defaultProperties not found

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 3597c2ebf -> e5f27521d [Minor] Build Failed: value defaultProperties not found Mvn Build Failed: value defaultProperties not found. Maybe related to this PR: https://github.com/apache/spark/commit/1d648123a77bbcd9b7a34cc0d66c14fa85edfec

spark git commit: [Minor] Build Failed: value defaultProperties not found

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 96d5b00ac -> 4346a2ba1 [Minor] Build Failed: value defaultProperties not found Mvn Build Failed: value defaultProperties not found. Maybe related to this PR: https://github.com/apache/spark/commit/1d648123a77bbcd9b7a34cc0d66c14fa85edfec

spark git commit: [Minor] Build Failed: value defaultProperties not found

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 15c03e1e0 -> a764960b3 [Minor] Build Failed: value defaultProperties not found Mvn Build Failed: value defaultProperties not found. Maybe related to this PR: https://github.com/apache/spark/commit/1d648123a77bbcd9b7a34cc0d66c14fa85edfecd an

spark git commit: [SPARK-4890] Upgrade Boto to 2.34.0; automatically download Boto from PyPi instead of packaging it

2014-12-19 Thread joshrosen
I've tested this with Python 2.6, too. Author: Josh Rosen Closes #3737 from JoshRosen/update-boto and squashes the following commits: 0aa43cc [Josh Rosen] Remove unused setup_standalone_cluster() method. f02935d [Josh Rosen] Enable Python deprecation warnings and fix one Boto warning:

spark git commit: [SPARK-4896] don’t redundantly overwrite executor JAR deps

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 2d6646359 -> 546a239eb [SPARK-4896] don’t redundantly overwrite executor JAR deps Author: Ryan Williams Closes #2848 from ryan-williams/fetch-file and squashes the following commits: c14daff [Ryan Williams] Fix copy that was change

spark git commit: [SPARK-4896] don’t redundantly overwrite executor JAR deps

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 6aa88cc04 -> f930fe893 [SPARK-4896] don’t redundantly overwrite executor JAR deps Author: Ryan Williams Closes #2848 from ryan-williams/fetch-file and squashes the following commits: c14daff [Ryan Williams] Fix copy that was change

spark git commit: [SPARK-4896] don’t redundantly overwrite executor JAR deps

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master cdb2c645a -> 7981f9697 [SPARK-4896] don’t redundantly overwrite executor JAR deps Author: Ryan Williams Closes #2848 from ryan-williams/fetch-file and squashes the following commits: c14daff [Ryan Williams] Fix copy that was changed to
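The idea behind SPARK-4896 is to skip re-writing a dependency that is already present with identical contents, rather than overwriting it on every fetch. A hedged Python sketch of that pattern (hypothetical `fetch_file` helper, content compared by hash; not the actual Spark implementation):

```python
import hashlib
import os
import shutil
import tempfile

def fetch_file(src, dest):
    """Copy src to dest, skipping the write when dest already matches.

    Illustrative sketch of the "don't redundantly overwrite" idea; the
    real change avoids re-fetching executor JAR deps that are cached.
    """
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    if os.path.exists(dest) and digest(dest) == digest(src):
        return "skipped"            # identical file already in place
    shutil.copyfile(src, dest)
    return "copied"

d = tempfile.mkdtemp()
src = os.path.join(d, "dep.jar")
dest = os.path.join(d, "cached.jar")
with open(src, "wb") as f:
    f.write(b"jar-bytes")

assert fetch_file(src, dest) == "copied"    # first fetch writes the file
assert fetch_file(src, dest) == "skipped"   # second fetch is a no-op
```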

spark git commit: [Build] Remove spark-staging-1038

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 5479450c4 -> 8e253ebbf [Build] Remove spark-staging-1038 Author: scwf Closes #3743 from scwf/abc and squashes the following commits: 7d98bc8 [scwf] removing spark-staging-1038 Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-4901] [SQL] Hot fix for ByteWritables.copyBytes

2014-12-19 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 283263ffa -> 5479450c4 [SPARK-4901] [SQL] Hot fix for ByteWritables.copyBytes HiveInspectors.scala failed in compiling with Hadoop 1, as the BytesWritable.copyBytes is not available in Hadoop 1. Author: Cheng Hao Closes #3742 from cheng

spark git commit: SPARK-3428. TaskMetrics for running tasks is missing GC time metrics

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d7fc69a8b -> 283263ffa SPARK-3428. TaskMetrics for running tasks is missing GC time metrics Author: Sandy Ryza Closes #3684 from sryza/sandy-spark-3428 and squashes the following commits: cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics for

spark git commit: SPARK-3428. TaskMetrics for running tasks is missing GC time metrics

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 f4e6ffc4c -> 2d6646359 SPARK-3428. TaskMetrics for running tasks is missing GC time metrics Author: Sandy Ryza Closes #3684 from sryza/sandy-spark-3428 and squashes the following commits: cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics

spark git commit: SPARK-3428. TaskMetrics for running tasks is missing GC time metrics

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 ca37639aa -> fd7bb9d97 SPARK-3428. TaskMetrics for running tasks is missing GC time metrics Author: Sandy Ryza Closes #3684 from sryza/sandy-spark-3428 and squashes the following commits: cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics

spark git commit: [SPARK-4674] Refactor getCallSite

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ee1fb97a9 -> d7fc69a8b [SPARK-4674] Refactor getCallSite The current version of `getCallSite` visits the collection of `StackTraceElement` twice. However, it is unnecessary since we can perform our work with a single visit. We also do not
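The SPARK-4674 refactoring collapses two scans over the stack trace into one: each frame is classified during a single loop instead of being searched twice by separate helpers. A rough Python sketch of the single-visit idea (hypothetical `split_call_site` function, not Spark's real `getCallSite`):

```python
def split_call_site(frames, is_internal):
    """Single pass: find the last internal frame and the first user frame.

    `frames` is ordered innermost-first, like a JVM stack trace.
    Illustrative stand-in for the getCallSite refactoring, which avoids
    visiting the StackTraceElement collection twice.
    """
    last_internal = None
    first_user = None
    for frame in frames:          # one visit instead of two separate scans
        if is_internal(frame):
            if first_user is None:
                last_internal = frame
        elif first_user is None:
            first_user = frame
    return last_internal, first_user

frames = ["spark.rdd.RDD.map", "spark.SparkContext.runJob", "myapp.main"]
internal, user = split_call_site(frames, lambda f: f.startswith("spark."))
assert internal == "spark.SparkContext.runJob"
assert user == "myapp.main"
```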

spark git commit: [branch-1.0][SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 d2f86331d -> e0fc0c56f [branch-1.0][SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample Port #3010 to branch-1.0. Author: Xiangrui Meng Closes #3106 from mengxr/SPARK-4148-1.0 and squashes the following comm

spark git commit: [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 61c9b89d8 -> 075b399c5 [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config This is used in NioBlockTransferService here: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/netwo

spark git commit: [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f9f58b9a0 -> 105293a7d [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config This is used in NioBlockTransferService here: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/network/n

spark git commit: SPARK-4743 - Use SparkEnv.serializer instead of closureSerializer in aggregateByKey and foldByKey

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master d5a596d41 -> f9f58b9a0 SPARK-4743 - Use SparkEnv.serializer instead of closureSerializer in aggregateByKey and foldByKey Author: Ivan Vergiliev Closes #3605 from IvanVergiliev/change-serializer and squashes the following commits: a49b7

spark git commit: [SPARK-4884]: Improve Partition docs

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 f0eed6e30 -> d2f86331d [SPARK-4884]: Improve Partition docs Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues

spark git commit: [SPARK-4884]: Improve Partition docs

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 c15e7f211 -> f4e6ffc4c [SPARK-4884]: Improve Partition docs Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues

spark git commit: [SPARK-4884]: Improve Partition docs

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master a7ed6f3cc -> d5a596d41 [SPARK-4884]: Improve Partition docs Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues.apa

spark git commit: [SPARK-4884]: Improve Partition docs

2014-12-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 e7f9dd5cd -> 61c9b89d8 [SPARK-4884]: Improve Partition docs Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues

spark git commit: [SPARK-4841] fix zip with textFile()

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 76c88c668 -> 0429ec308 [SPARK-4841] fix zip with textFile() UTF8Deserializer can not be used in BatchedSerializer, so always use PickleSerializer() when change batchSize in zip(). Also, if two RDD have the same batch size already, the

spark git commit: SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 e63516855 -> 76c88c668 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers

spark git commit: [SPARK-4772] Clear local copies of accumulators as soon as we're done with them

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 0ebbccb87 -> e63516855 [SPARK-4772] Clear local copies of accumulators as soon as we're done with them Accumulators keep thread-local copies of themselves. These copies were only cleared at the beginning of a task. This meant that (a

spark git commit: [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance.

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 51081e42b -> 0ebbccb87 [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance. After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`drop
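This is the classic re-check-under-the-lock pattern: between deciding to drop a block and acquiring its `BlockInfo` lock, another thread may already have removed the block, so the state must be checked again after synchronizing. A hedged Python sketch (hypothetical block store, not BlockManager's actual code):

```python
import threading

blocks = {"b1": b"payload"}
info_locks = {"b1": threading.Lock()}

def drop_from_memory(block_id):
    lock = info_locks.get(block_id)
    if lock is None:
        return False
    with lock:
        # Re-check under the lock: another thread may have removed the
        # block between our decision to drop it and acquiring the lock.
        if block_id not in blocks:
            return False
        del blocks[block_id]
        return True

assert drop_from_memory("b1") is True
assert drop_from_memory("b1") is False   # already gone; the re-check catches it
```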

spark git commit: [SPARK-4691][shuffle] Restructure a few lines in shuffle code

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 26dfac6e9 -> 51081e42b [SPARK-4691][shuffle] Restructure a few lines in shuffle code In HashShuffleReader.scala and HashShuffleWriter.scala, no need to judge "dep.aggregator.isEmpty" again as this is judged by "dep.aggregator.isDefine

spark git commit: SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not serializable

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 7ecf30e35 -> 26dfac6e9 SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not serializable My original 'fix' didn't fix at all. Now, there's a unit test to check whether it works. Of the two options to really fix it -- cop

spark git commit: [SPARK-4750] Dynamic allocation - synchronize kills

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 e1d839e96 -> 7ecf30e35 [SPARK-4750] Dynamic allocation - synchronize kills Simple omission on my part. Author: Andrew Or Closes #3612 from andrewor14/dynamic-allocation-synchronization and squashes the following commits: 1f03b60 [A

spark git commit: [SPARK-4764] Ensure that files are fetched atomically

2014-12-17 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 2f00a29d4 -> e1d839e96 [SPARK-4764] Ensure that files are fetched atomically tempFile is created in the same directory than targetFile, so that the move from tempFile to targetFile is always atomic Author: Christophe Préaud Closes #
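The trick described in SPARK-4764 is to create the temp file in the *same directory* as the target, so the final rename happens within one filesystem and is atomic: readers never observe a half-written file. A minimal Python sketch of the same pattern (hypothetical helper, not Spark's `fetchFile`):

```python
import os
import tempfile

def fetch_atomically(data, target_path):
    """Write to a temp file in the target's directory, then rename.

    Renaming within one filesystem is atomic, so concurrent readers see
    either the old file or the complete new one, never a partial write.
    """
    target_dir = os.path.dirname(target_path)
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, target_path)   # atomic on POSIX and Windows
    except Exception:
        os.unlink(tmp_path)                 # clean up the partial file
        raise

d = tempfile.mkdtemp()
target = os.path.join(d, "fetched.bin")
fetch_atomically(b"contents", target)
with open(target, "rb") as f:
    assert f.read() == b"contents"
```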

spark git commit: [SPARK-4595][Core] Fix MetricsServlet not work issue

2014-12-17 Thread joshrosen
sai Shao Author: Josh Rosen Author: jerryshao Closes #3444 from jerryshao/SPARK-4595 and squashes the following commits: 434d17e [Saisai Shao] Merge pull request #10 from JoshRosen/metrics-system-cleanup 87a2292 [Josh Rosen] Guard against misuse of MetricsSystem methods. f779fe0 [jerryshao]

spark git commit: [SPARK-4595][Core] Fix MetricsServlet not work issue

2014-12-17 Thread joshrosen
hao Author: Josh Rosen Author: jerryshao Closes #3444 from jerryshao/SPARK-4595 and squashes the following commits: 434d17e [Saisai Shao] Merge pull request #10 from JoshRosen/metrics-system-cleanup 87a2292 [Josh Rosen] Guard against misuse of MetricsSystem methods. f779fe0 [jerryshao]

spark git commit: [HOTFIX] Fix RAT exclusion for known_translations file

2014-12-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 0efd691d9 -> c15e7f211 [HOTFIX] Fix RAT exclusion for known_translations file Author: Josh Rosen Closes #3719 from JoshRosen/rat-fix and squashes the following commits: 1542886 [Josh Rosen] [HOTFIX] Fix RAT exclusion

spark git commit: [HOTFIX] Fix RAT exclusion for known_translations file

2014-12-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 beb75aca6 -> b5919d1b5 [HOTFIX] Fix RAT exclusion for known_translations file Author: Josh Rosen Closes #3719 from JoshRosen/rat-fix and squashes the following commits: 1542886 [Josh Rosen] [HOTFIX] Fix RAT exclusion

spark git commit: [HOTFIX] Fix RAT exclusion for known_translations file

2014-12-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 4e1112e7b -> 3d0c37b81 [HOTFIX] Fix RAT exclusion for known_translations file Author: Josh Rosen Closes #3719 from JoshRosen/rat-fix and squashes the following commits: 1542886 [Josh Rosen] [HOTFIX] Fix RAT exclusion

spark git commit: SPARK-4767: Add support for launching in a specified placement group to spark_ec2

2014-12-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 6530243a5 -> b0dfdbdd1 SPARK-4767: Add support for launching in a specified placement group to spark_ec2 Placement groups are cool and all the cool kids are using them. Lets add support for them to spark_ec2.py because I'm lazy Author: H

spark git commit: [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py

2014-12-16 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master cb4844749 -> d12c0711f [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py Based on this gist: https://gist.github.com/amar-analytx/0b62543621e1f246c0a2 We use security group ids instead of security group to get around this issue

spark git commit: [SPARK-4437] update doc for WholeCombineFileRecordReader

2014-12-16 Thread joshrosen
its: 1d7422f [Davies Liu] Merge pull request #2 from JoshRosen/whole-text-file-cleanup dc3d21a [Josh Rosen] More genericization in ConfigurableCombineFileRecordReader. 95d13eb [Davies Liu] address comment bf800b9 [Davies Liu] update doc for WholeCombineFileRecordReader Project: http://git-

spark git commit: [SPARK-4841] fix zip with textFile()

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master c7628771d -> c246b95dd [SPARK-4841] fix zip with textFile() UTF8Deserializer can not be used in BatchedSerializer, so always use PickleSerializer() when change batchSize in zip(). Also, if two RDD have the same batch size already, they di
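The underlying issue in SPARK-4841 is that zipping two batched streams only pairs elements correctly when both sides use the same serializer and batch size. A rough Python sketch of rebatching two streams onto a common batch size before zipping (hypothetical helpers, not PySpark's real serializer machinery):

```python
def rebatch(batches, size):
    """Flatten incoming batches and re-group them into batches of `size`."""
    buf = []
    for batch in batches:
        for item in batch:
            buf.append(item)
            if len(buf) == size:
                yield buf
                buf = []
    if buf:
        yield buf

def zip_streams(a_batches, b_batches, size):
    # Force both sides onto identical batch sizes before pairing, the way
    # the fix forces both RDDs onto the same serializer/batch size.
    for a, b in zip(rebatch(a_batches, size), rebatch(b_batches, size)):
        yield from zip(a, b)

a = [[1, 2, 3], [4]]           # batched one way
b = [["a", "b"], ["c", "d"]]   # batched another way
assert list(zip_streams(a, b, 2)) == [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
```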

spark git commit: [SPARK-4792] Add error message when making local dir unsuccessfully

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 81112e4b5 -> c7628771d [SPARK-4792] Add error message when making local dir unsuccessfully Author: meiyoula <1039320...@qq.com> Closes #3635 from XuTingjun/master and squashes the following commits: dd1c66d [meiyoula] when old is deleted,

spark git commit: SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 fa3b3e384 -> 892685b37 SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger This enables assertions for the Maven and SBT build, but overrides the Hive module to not enable assertions.

spark git commit: SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 f1f27ec9c -> 6bd8a9666 SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger This enables assertions for the Maven and SBT build, but overrides the Hive module to not enable assertions.

spark git commit: SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 5c24759dd -> 81112e4b5 SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger This enables assertions for the Maven and SBT build, but overrides the Hive module to not enable assertions. Au

spark git commit: [Minor][Core] fix comments in MapOutputTracker

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 ec1917540 -> f1f27ec9c [Minor][Core] fix comments in MapOutputTracker Using driver and executor in the comments of ```MapOutputTracker``` is more clear. Author: wangfei Closes #3700 from scwf/commentFix and squashes the following co

spark git commit: [Minor][Core] fix comments in MapOutputTracker

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 2a28bc610 -> 5c24759dd [Minor][Core] fix comments in MapOutputTracker Using driver and executor in the comments of ```MapOutputTracker``` is more clear. Author: wangfei Closes #3700 from scwf/commentFix and squashes the following commit

spark git commit: SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-0.9 b06001482 -> 63c0ff992 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers

spark git commit: SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 f96363469 -> b9b6762f1 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers

spark git commit: SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 0faea1761 -> fa3b3e384 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers

spark git commit: SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 8176b7a02 -> 2a28bc610 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers app

spark git commit: [SPARK-1037] The name of findTaskFromList & findTask in TaskSetManager.scala is confusing

2014-12-15 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f6b8591a0 -> 38703bbca [SPARK-1037] The name of findTaskFromList & findTask in TaskSetManager.scala is confusing Hi all - I've renamed the methods referenced in this JIRA to clarify that they modify the provided arrays (find vs. deque).

spark git commit: [SPARK-4826] Fix generation of temp file names in WAL tests

2014-12-15 Thread joshrosen
log suites. Closes #3695. Closes #3701. Author: Josh Rosen Closes #3704 from JoshRosen/SPARK-4826 and squashes the following commits: f2307f5 [Josh Rosen] Use Spark Utils class for directory creation/deletion a693ddb [Josh Rosen] remove unused Random import b275e41 [Josh Rosen] Move creation of t

spark git commit: [SPARK-4826] Fix generation of temp file names in WAL tests

2014-12-15 Thread joshrosen
tes. Closes #3695. Closes #3701. Author: Josh Rosen Closes #3704 from JoshRosen/SPARK-4826 and squashes the following commits: f2307f5 [Josh Rosen] Use Spark Utils class for directory creation/deletion a693ddb [Josh Rosen] remove unused Random import b275e41 [Josh Rosen] Move creation of temp. dir

spark git commit: fixed spelling errors in documentation

2014-12-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 9c213160e -> f96363469 fixed spelling errors in documentation changed "form" to "from" in 3 documentation entries for Kafka integration Author: Peter Klipfel Closes #3691 from peterklipfel/master and squashes the following commits:

spark git commit: fixed spelling errors in documentation

2014-12-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 396de67fc -> 0faea1761 fixed spelling errors in documentation changed "form" to "from" in 3 documentation entries for Kafka integration Author: Peter Klipfel Closes #3691 from peterklipfel/master and squashes the following commits:

spark git commit: fixed spelling errors in documentation

2014-12-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 c82e99d87 -> 6eec4bc3b fixed spelling errors in documentation changed "form" to "from" in 3 documentation entries for Kafka integration Author: Peter Klipfel Closes #3691 from peterklipfel/master and squashes the following commits:

spark git commit: fixed spelling errors in documentation

2014-12-14 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ef84dab8c -> 2a2983f7c fixed spelling errors in documentation changed "form" to "from" in 3 documentation entries for Kafka integration Author: Peter Klipfel Closes #3691 from peterklipfel/master and squashes the following commits: 0fe7

spark git commit: [CORE]codeStyle: uniform ConcurrentHashMap define in StorageLevel.scala with other places

2014-12-10 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 652b781a9 -> 57d37f9c7 [CORE]codeStyle: uniform ConcurrentHashMap define in StorageLevel.scala with other places Author: Zhang, Liye Closes #2793 from liyezhang556520/uniformHashMap and squashes the following commits: 5884735 [Zhang, L

spark git commit: [SPARK-4772] Clear local copies of accumulators as soon as we're done with them

2014-12-10 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 8ce676047 -> 3425ba815 [SPARK-4772] Clear local copies of accumulators as soon as we're done with them Accumulators keep thread-local copies of themselves. These copies were only cleared at the beginning of a task. This meant that (a

spark git commit: [SPARK-4772] Clear local copies of accumulators as soon as we're done with them

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 9b9923744 -> 6dcafa7ce [SPARK-4772] Clear local copies of accumulators as soon as we're done with them Accumulators keep thread-local copies of themselves. These copies were only cleared at the beginning of a task. This meant that (a

spark git commit: [SPARK-4772] Clear local copies of accumulators as soon as we're done with them

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f79c1cfc9 -> 94b377f94 [SPARK-4772] Clear local copies of accumulators as soon as we're done with them Accumulators keep thread-local copies of themselves. These copies were only cleared at the beginning of a task. This meant that (a) th
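
The SPARK-4772 entries above concern per-thread accumulator copies that were only cleared at the start of the *next* task, keeping stale state alive in between. A minimal Python sketch of the eager-clearing pattern (hypothetical `LocalAccumulators` class, not Spark's Accumulator API): each thread accumulates into its own thread-local map, and the copies are drained and cleared as soon as the task finishes:

```python
import threading


class LocalAccumulators:
    """Per-thread accumulator copies, cleared eagerly when a task completes."""

    def __init__(self):
        self._local = threading.local()

    def add(self, name, value):
        # Lazily create this thread's private copy of the accumulators.
        copies = getattr(self._local, "copies", None)
        if copies is None:
            copies = self._local.copies = {}
        copies[name] = copies.get(name, 0) + value

    def drain(self):
        # Return and clear this thread's copies immediately, rather than
        # letting stale values linger until the next task resets them.
        copies = getattr(self._local, "copies", {})
        self._local.copies = {}
        return copies
```

Clearing at task end (rather than task start) means a thread that never runs another task no longer retains its last task's accumulator state.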

spark git commit: [Minor] Use `<sup>` tag for help icon in web UI page header

2014-12-09 Thread joshrosen
ses #3659 from JoshRosen/webui-help-sup and squashes the following commits: bd72899 [Josh Rosen] Use `<sup>` tag for help icon in web UI page header. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f79c1cfc Tree: http://git-

spark git commit: [Minor] Use `<sup>` tag for help icon in web UI page header

2014-12-09 Thread joshrosen
ses #3659 from JoshRosen/webui-help-sup and squashes the following commits: bd72899 [Josh Rosen] Use `<sup>` tag for help icon in web UI page header. (cherry picked from commit f79c1cfc997c1a7ddee480ca3d46f5341b69d3b7) Signed-off-by: Josh Rosen Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 5a3a3cc17 -> 51da2c557 SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable Author: Sandy Ryza Closes #3426 from sryza/sandy-spark-4567 and squashes the following commits: cb4b8d2 [Sandy Ryza] SPARK-4567. Make SparkJobInfo a

spark git commit: SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 30dca924d -> 5e4c06f8e SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable Author: Sandy Ryza Closes #3426 from sryza/sandy-spark-4567 and squashes the following commits: cb4b8d2 [Sandy Ryza] SPARK-4567. Make SparkJobInfo and S

spark git commit: [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance.

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.0 c118e0f23 -> 8ce676047 [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance. After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`drop

spark git commit: [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance.

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.1 fe7d7a983 -> 9b9923744 [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance. After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`drop

spark git commit: [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance.

2014-12-09 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 1f5110630 -> 30dca924d [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance. After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`dropFrom
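
The SPARK-4714 fix above is an instance of the re-check-after-locking pattern: between deciding to drop a block and acquiring its lock, another thread may already have removed it. A minimal Python sketch (hypothetical `BlockStore` class, not Spark's BlockManager) of that pattern:

```python
import threading


class BlockStore:
    """Illustrates re-checking shared state after acquiring its lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._blocks = {"block-1": b"data"}

    def remove(self, block_id):
        with self._lock:
            self._blocks.pop(block_id, None)

    def drop_from_memory(self, block_id):
        with self._lock:
            # Another thread may have removed the block while we were
            # waiting on the lock, so re-check before acting on it.
            if block_id not in self._blocks:
                return False
            del self._blocks[block_id]
            return True
```

Without the re-check, the drop path would operate on (or fail on) a block that no longer exists, which is exactly the race the patch closes in `dropFromMemory()`.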

spark git commit: [SQL] remove unnecessary import in spark-sql

2014-12-08 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master cda94d15e -> 944384363 [SQL] remove unnecessary import in spark-sql Author: Jacky Li Closes #3630 from jackylk/remove and squashes the following commits: 150e7e0 [Jacky Li] remove unnecessary import Project: http://git-wip-us.apache.or

spark git commit: SPARK-4770. [DOC] [YARN] spark.scheduler.minRegisteredResourcesRatio documented default is incorrect for YARN

2014-12-08 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 9ed5641a5 -> f4160324c SPARK-4770. [DOC] [YARN] spark.scheduler.minRegisteredResourcesRatio documented default is incorrect for YARN Author: Sandy Ryza Closes #3624 from sryza/sandy-spark-4770 and squashes the following commit
