spark git commit: [SPARK-8644] Include call site in SparkException stack traces thrown by job failures

2015-07-16 Thread adav
Repository: spark Updated Branches: refs/heads/master 031d7d414 - 57e9b13bf [SPARK-8644] Include call site in SparkException stack traces thrown by job failures Example exception (new part at bottom, clearly demarcated): ``` org.apache.spark.SparkException: Job aborted due to stage failure:

spark git commit: [SPARK-7183] [NETWORK] Fix memory leak of TransportRequestHandler.streamIds

2015-05-01 Thread adav
Repository: spark Updated Branches: refs/heads/master 1262e310c - 168603272 [SPARK-7183] [NETWORK] Fix memory leak of TransportRequestHandler.streamIds JIRA: https://issues.apache.org/jira/browse/SPARK-7183 Author: Liang-Chi Hsieh vii...@gmail.com Closes #5743 from

spark git commit: [SPARK-7003] Improve reliability of connection failure detection between Netty block transfer service endpoints

2015-04-20 Thread adav
Repository: spark Updated Branches: refs/heads/master 1be207078 - 968ad9721 [SPARK-7003] Improve reliability of connection failure detection between Netty block transfer service endpoints Currently we rely on the assumption that an exception will be raised and the channel closed if two

spark git commit: [Minor] [SQL] [SPARK-6729] Minor fix for DriverQuirks get

2015-04-06 Thread adav
Repository: spark Updated Branches: refs/heads/master 30363ede8 - e40ea8742 [Minor] [SQL] [SPARK-6729] Minor fix for DriverQuirks get The function uses .substring(0, X), which will trigger OutOfBoundsException if string length is less than X. A better way to do this is to use startsWith,
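A minimal sketch of the failure mode and the startsWith fix described above; the method names and the jdbc prefix are illustrative, not the actual DriverQuirks code:

```scala
object QuirksExample {
  // Fragile: substring(0, 10) throws StringIndexOutOfBoundsException
  // whenever the URL is shorter than 10 characters.
  def driverBySubstring(url: String): String =
    if (url.substring(0, 10) == "jdbc:mysql") "mysql" else "generic"

  // Safer: startsWith handles short strings without throwing.
  def driverByStartsWith(url: String): String =
    if (url.startsWith("jdbc:mysql")) "mysql" else "generic"

  def main(args: Array[String]): Unit = {
    println(driverByStartsWith("jdbc:h2:mem:test")) // "generic"
    // driverBySubstring("jdbc:h2") would throw: length 7 < 10.
  }
}
```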

spark git commit: [SPARK-6122][Core] Upgrade Tachyon client version to 0.6.1.

2015-03-22 Thread adav
Repository: spark Updated Branches: refs/heads/master 6ef48632f - a41b9c600 [SPARK-6122][Core] Upgrade Tachyon client version to 0.6.1. Changes the Tachyon client version from 0.5 to 0.6 in spark core and distribution script. New dependencies in Tachyon 0.6.0 include

spark git commit: [SPARK-4012] stop SparkContext when the exception is thrown from an infinite loop

2015-03-19 Thread adav
Repository: spark Updated Branches: refs/heads/master 645cf3fcc - 2c3f83c34 [SPARK-4012] stop SparkContext when the exception is thrown from an infinite loop https://issues.apache.org/jira/browse/SPARK-4012 This patch is a resubmission for https://github.com/apache/spark/pull/2864 What I

spark git commit: [SPARK-6330] Fix filesystem bug in newParquet relation

2015-03-16 Thread adav
Repository: spark Updated Branches: refs/heads/master 12a345adc - d19efeddc [SPARK-6330] Fix filesystem bug in newParquet relation If I'm running this locally and my path points to S3, this would currently error out because of incorrect FS. I tested this in a scenario that previously didn't

spark git commit: [SPARK-6330] Fix filesystem bug in newParquet relation

2015-03-16 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.3 684ff2476 - 67fa6d1f8 [SPARK-6330] Fix filesystem bug in newParquet relation If I'm running this locally and my path points to S3, this would currently error out because of incorrect FS. I tested this in a scenario that previously

spark git commit: [SPARK-5073] spark.storage.memoryMapThreshold have two default value

2015-01-11 Thread adav
Repository: spark Updated Branches: refs/heads/master 331326090 - 1656aae2b [SPARK-5073] spark.storage.memoryMapThreshold have two default value Because the typical OS page size is about 4KB, the default value of spark.storage.memoryMapThreshold is unified to 2 * 4096 Author: lewuathe

spark git commit: [Minor] Fix test RetryingBlockFetcherSuite after changed config name

2015-01-09 Thread adav
Repository: spark Updated Branches: refs/heads/master f3da4bd72 - b4034c3f8 [Minor] Fix test RetryingBlockFetcherSuite after changed config name Flakey due to the default retry interval being the same as our test's wait timeout. Author: Aaron Davidson aa...@databricks.com Closes #3972 from

spark git commit: SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion

2014-12-09 Thread adav
Repository: spark Updated Branches: refs/heads/master 5e4c06f8e - d8f84f26e SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion Allocate enough room for type byte as well as message, to avoid tripping assertion about capacity of the buffer Author: Sean Owen
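A minimal sketch of the sizing fix described here: reserve one byte for the type tag in addition to the encoded body. The Msg trait is an illustrative stand-in, not the real BlockTransferMessage API:

```scala
import java.nio.ByteBuffer

object EncodeExample {
  // Illustrative stand-in for an encodable message with a one-byte type tag.
  trait Msg {
    def typeByte: Byte
    def encodedLength: Int
    def encode(buf: ByteBuffer): Unit
  }

  def toByteArray(msg: Msg): Array[Byte] = {
    // Room for the type byte *plus* the body; sizing the buffer as just
    // encodedLength is the kind of off-by-one that trips a capacity assertion.
    val buf = ByteBuffer.allocate(1 + msg.encodedLength)
    buf.put(msg.typeByte)
    msg.encode(buf)
    assert(buf.remaining() == 0, "message did not fill the buffer it sized")
    buf.array()
  }
}
```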

spark git commit: SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion

2014-12-09 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.2 51da2c557 - b0d64e572 SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion Allocate enough room for type byte as well as message, to avoid tripping assertion about capacity of the buffer Author: Sean Owen

spark git commit: Config updates for the new shuffle transport.

2014-12-09 Thread adav
Repository: spark Updated Branches: refs/heads/master 2b9b72682 - 9bd9334f5 Config updates for the new shuffle transport. Author: Reynold Xin r...@databricks.com Closes #3657 from rxin/conf-update and squashes the following commits: 7370eab [Reynold Xin] Config updates for the new shuffle

spark git commit: Config updates for the new shuffle transport.

2014-12-09 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.2 441ec3451 - 5e5d8f469 Config updates for the new shuffle transport. Author: Reynold Xin r...@databricks.com Closes #3657 from rxin/conf-update and squashes the following commits: 7370eab [Reynold Xin] Config updates for the new

spark git commit: [SPARK-3154][STREAMING] Replace ConcurrentHashMap with mutable.HashMap and remove @volatile from 'stopped'

2014-12-08 Thread adav
Repository: spark Updated Branches: refs/heads/master 51b1fe142 - bcb5cdad6 [SPARK-3154][STREAMING] Replace ConcurrentHashMap with mutable.HashMap and remove @volatile from 'stopped' Since `sequenceNumberToProcessor` and `stopped` are both protected by the lock `sequenceNumberToProcessor`,
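A rough sketch of the locking pattern the snippet describes: once every access happens inside synchronized on the map, a plain mutable.HashMap and an unannotated Boolean are enough. The surrounding class and the String value type are simplified stand-ins, not the actual receiver code:

```scala
import scala.collection.mutable

class ProcessorTracker {
  // Both fields are only touched while holding sequenceNumberToProcessor's lock,
  // so neither a ConcurrentHashMap nor @volatile is needed.
  private val sequenceNumberToProcessor = new mutable.HashMap[Long, String]()
  private var stopped = false

  def register(seq: Long, processor: String): Boolean =
    sequenceNumberToProcessor.synchronized {
      if (stopped) false
      else { sequenceNumberToProcessor(seq) = processor; true }
    }

  def stop(): Unit = sequenceNumberToProcessor.synchronized {
    stopped = true
    sequenceNumberToProcessor.clear()
  }
}
```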

spark git commit: [SPARK-4326] fix unidoc

2014-11-13 Thread adav
Repository: spark Updated Branches: refs/heads/master a0fa1ba70 - 4b0c1edfd [SPARK-4326] fix unidoc There are two issues: 1. specifying guava 11.0.2 will cause hashInt not found in unidoc (any reason to force the version here?) 2. unidoc doesn't recognize static class defined in a base

spark git commit: [SPARK-4326] fix unidoc

2014-11-13 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.2 c07592e40 - d993a44de [SPARK-4326] fix unidoc There are two issues: 1. specifying guava 11.0.2 will cause hashInt not found in unidoc (any reason to force the version here?) 2. unidoc doesn't recognize static class defined in a base

spark git commit: [SPARK-4307] Initialize FileDescriptor lazily in FileRegion.

2014-11-11 Thread adav
Repository: spark Updated Branches: refs/heads/master 65083e93d - ef29a9a9a [SPARK-4307] Initialize FileDescriptor lazily in FileRegion. Netty's DefaultFileRegion requires a FileDescriptor in its constructor, which means we need to have an open file handle. In super large workloads, this
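A rough sketch of the lazy-open idea described here, using an invented LazilyOpenedRegion class rather than the actual Netty-facing patch:

```scala
import java.io.{File, RandomAccessFile}
import java.nio.channels.{FileChannel, WritableByteChannel}

// Defers opening the file until data is actually transferred, so creating
// many regions up front does not hold one descriptor per block.
class LazilyOpenedRegion(file: File, offset: Long, length: Long) {
  private lazy val channel: FileChannel = new RandomAccessFile(file, "r").getChannel

  def transferTo(target: WritableByteChannel, position: Long): Long =
    channel.transferTo(offset + position, length - position, target)
}
```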

spark git commit: [SPARK-4307] Initialize FileDescriptor lazily in FileRegion.

2014-11-11 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.2 df8242c9b - e9d009dc3 [SPARK-4307] Initialize FileDescriptor lazily in FileRegion. Netty's DefaultFileRegion requires a FileDescriptor in its constructor, which means we need to have an open file handle. In super large workloads,

spark git commit: SPARK-1830 Deploy failover, Make Persistence engine and LeaderAgent Pluggable

2014-11-11 Thread adav
Repository: spark Updated Branches: refs/heads/master 6e03de304 - deefd9d73 SPARK-1830 Deploy failover, Make Persistence engine and LeaderAgent Pluggable Author: Prashant Sharma prashan...@imaginea.com Closes #771 from ScrapCodes/deploy-failover-pluggable and squashes the following commits:

spark git commit: [SPARK-4264] Completion iterator should only invoke callback once

2014-11-06 Thread adav
Repository: spark Updated Branches: refs/heads/master b41a39e24 - 23eaf0e12 [SPARK-4264] Completion iterator should only invoke callback once Author: Aaron Davidson aa...@databricks.com Closes #3128 from aarondav/compiter and squashes the following commits: 698e4be [Aaron Davidson]
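A minimal illustration of a completion iterator that fires its callback at most once; this sketches the idea with an invented class, not the patched CompletionIterator:

```scala
class OnceCompletionIterator[A](sub: Iterator[A])(completion: => Unit) extends Iterator[A] {
  private var completed = false

  private def tryComplete(): Unit = {
    if (!completed) {
      completed = true // flip the flag first so repeated calls are no-ops
      completion
    }
  }

  override def hasNext: Boolean = {
    val more = sub.hasNext
    if (!more) tryComplete() // calling hasNext again after exhaustion won't re-fire
    more
  }

  override def next(): A = sub.next()
}
```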

spark git commit: [SPARK-4264] Completion iterator should only invoke callback once

2014-11-06 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.2 01484455c - aaaeaf939 [SPARK-4264] Completion iterator should only invoke callback once Author: Aaron Davidson aa...@databricks.com Closes #3128 from aarondav/compiter and squashes the following commits: 698e4be [Aaron Davidson]

git commit: [SPARK-4163][Core] Add a backward compatibility test for FetchFailed

2014-11-03 Thread adav
Repository: spark Updated Branches: refs/heads/master 1a9c6cdda - 9bdc8412a [SPARK-4163][Core] Add a backward compatibility test for FetchFailed /cc aarondav Author: zsxwing zsxw...@gmail.com Closes #3086 from zsxwing/SPARK-4163-back-comp and squashes the following commits: 21cb2a8

git commit: [SPARK-4163][Core][WebUI] Send the fetch failure message back to Web UI

2014-11-02 Thread adav
Repository: spark Updated Branches: refs/heads/master 001acc446 - 76386e1a2 [SPARK-4163][Core][WebUI] Send the fetch failure message back to Web UI This is a PR to send the fetch failure message back to Web UI. Before:

git commit: HOTFIX: Clean up build in network module.

2014-10-30 Thread adav
Repository: spark Updated Branches: refs/heads/master 26d31d15f - 0734d0932 HOTFIX: Clean up build in network module. This is currently breaking the package build for some people (including me). This patch does some general clean-up which also fixes the current issue. - Uses consistent

[1/2] [SPARK-4084] Reuse sort key in Sorter

2014-10-28 Thread adav
Repository: spark Updated Branches: refs/heads/master 4b55482ab - 84e5da87e http://git-wip-us.apache.org/repos/asf/spark/blob/84e5da87/core/src/test/scala/org/apache/spark/util/collection/SorterSuite.scala -- diff --git

[2/2] git commit: [SPARK-4084] Reuse sort key in Sorter

2014-10-28 Thread adav
[SPARK-4084] Reuse sort key in Sorter Sorter uses generic-typed key for sorting. When data is large, it creates lots of key objects, which is not efficient. We should reuse the key in Sorter for memory efficiency. This change is part of the petabyte sort implementation from rxin . The

git commit: [SPARK-4008] Fix kryo with fold in KryoSerializerSuite

2014-10-28 Thread adav
Repository: spark Updated Branches: refs/heads/master 84e5da87e - 1536d7033 [SPARK-4008] Fix kryo with fold in KryoSerializerSuite `zeroValue` will be serialized by `spark.closure.serializer` but `spark.closure.serializer` only supports the default Java serializer. So it must not be

git commit: Add more debug message for ManagedBuffer

2014-09-29 Thread adav
Repository: spark Updated Branches: refs/heads/master dab1b0ae2 - e43c72fe0 Add more debug message for ManagedBuffer This is to help debug the error reported at http://apache-spark-user-list.1001560.n3.nabble.com/SQL-queries-fail-in-1-2-0-SNAPSHOT-td15327.html Author: Reynold Xin

git commit: [Build] suppress curl/wget progress bars

2014-09-05 Thread adav
Repository: spark Updated Branches: refs/heads/master ba5bcadde - 19f61c165 [Build] suppress curl/wget progress bars In the Jenkins console output, `curl` gives us mountains of `#` symbols as it tries to show its download progress. ![noise from curl in Jenkins

git commit: [SPARK-2936] Migrate Netty network module from Java to Scala

2014-08-10 Thread adav
Repository: spark Updated Branches: refs/heads/master b715aa0c8 - ba28a8fcb [SPARK-2936] Migrate Netty network module from Java to Scala The Netty network module was originally written when Scala 2.9.x had a bug that prevents a pure Scala implementation, and a subset of the files were done

git commit: [Spark 2557] fix LOCAL_N_REGEX in createTaskScheduler and make local-n and local-n-failures consistent

2014-08-01 Thread adav
Repository: spark Updated Branches: refs/heads/master f1957e116 - 284771efb [Spark 2557] fix LOCAL_N_REGEX in createTaskScheduler and make local-n and local-n-failures consistent [SPARK-2557](https://issues.apache.org/jira/browse/SPARK-2557) Author: Ye Xianjin advance...@gmail.com Closes

git commit: [SPARK-2764] Simplify daemon.py process structure

2014-08-01 Thread adav
Repository: spark Updated Branches: refs/heads/master a38d3c9ef - e8e0fd691 [SPARK-2764] Simplify daemon.py process structure Currently, daemon.py forks a pool of numProcessors subprocesses, and those processes fork themselves again to create the actual Python worker processes that handle

git commit: SPARK-2564. ShuffleReadMetrics.totalBlocksRead is redundant

2014-07-20 Thread adav
Repository: spark Updated Branches: refs/heads/master 1b10b8114 - 9564f8548 SPARK-2564. ShuffleReadMetrics.totalBlocksRead is redundant Author: Sandy Ryza sa...@cloudera.com Closes #1474 from sryza/sandy-spark-2564 and squashes the following commits: 35b8388 [Sandy Ryza] Fix compile error

git commit: [SPARK-2485][SQL] Lock usage of hive client.

2014-07-15 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 0e2727959 - 53a6399e5 [SPARK-2485][SQL] Lock usage of hive client. Author: Michael Armbrust mich...@databricks.com Closes #1412 from marmbrus/lockHiveClient and squashes the following commits: 4bc9d5a [Michael Armbrust]

git commit: [SPARK-2485][SQL] Lock usage of hive client.

2014-07-15 Thread adav
Repository: spark Updated Branches: refs/heads/master c6d75745d - c7c7ac833 [SPARK-2485][SQL] Lock usage of hive client. Author: Michael Armbrust mich...@databricks.com Closes #1412 from marmbrus/lockHiveClient and squashes the following commits: 4bc9d5a [Michael Armbrust] protected[hive]

git commit: Reformat multi-line closure argument.

2014-07-15 Thread adav
Repository: spark Updated Branches: refs/heads/master 04b01bb10 - cb09e93c1 Reformat multi-line closure argument. Author: William Benton wi...@redhat.com Closes #1419 from willb/reformat-2486 and squashes the following commits: 2676231 [William Benton] Reformat multi-line closure argument.

git commit: [SPARK-2403] Catch all errors during serialization in DAGScheduler

2014-07-08 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 4bf8ddaee - 3bd32f023 [SPARK-2403] Catch all errors during serialization in DAGScheduler https://issues.apache.org/jira/browse/SPARK-2403 Spark hangs for us whenever we forget to register a class with Kryo. This should be a simple

git commit: [SPARK-2403] Catch all errors during serialization in DAGScheduler

2014-07-08 Thread adav
Repository: spark Updated Branches: refs/heads/master cc3e0a14d - c8a2313cd [SPARK-2403] Catch all errors during serialization in DAGScheduler https://issues.apache.org/jira/browse/SPARK-2403 Spark hangs for us whenever we forget to register a class with Kryo. This should be a simple fix

git commit: [SPARK-2324] SparkContext should not exit directly when spark.local.dir is a list of multiple paths and one of them has error

2014-07-03 Thread adav
Repository: spark Updated Branches: refs/heads/master bc7041a42 - 3bbeca648 [SPARK-2324] SparkContext should not exit directly when spark.local.dir is a list of multiple paths and one of them has error The spark.local.dir is configured as a list of multiple paths as follows
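A hedged sketch of the tolerate-and-continue behaviour the snippet argues for: skip unusable directories and only fail when none remain. The helper is invented for illustration, not the actual DiskBlockManager code:

```scala
import java.io.File

object LocalDirsExample {
  // Keep the directories that exist or can be created, warn about the rest,
  // and only abort if none of the configured paths are usable.
  def createUsableDirs(localDirs: Seq[String]): Seq[File] = {
    val usable = localDirs.flatMap { path =>
      val dir = new File(path)
      if (dir.isDirectory || dir.mkdirs()) Some(dir)
      else {
        System.err.println(s"Ignoring unusable local dir: $path")
        None
      }
    }
    require(usable.nonEmpty, "No usable directories in spark.local.dir")
    usable
  }
}
```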

git commit: [SPARK] Fix NPE for ExternalAppendOnlyMap

2014-07-03 Thread adav
Repository: spark Updated Branches: refs/heads/master 3bbeca648 - c48053773 [SPARK] Fix NPE for ExternalAppendOnlyMap It did not handle null keys very gracefully before. Author: Andrew Or andrewo...@gmail.com Closes #1288 from andrewor14/fix-external and squashes the following commits:
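One common way to make a hash-based map tolerate null keys is to substitute a private sentinel. The sketch below illustrates that general trick and is not the actual ExternalAppendOnlyMap fix:

```scala
import scala.collection.mutable

class NullTolerantMap[V] {
  private object NullKey // sentinel standing in for null keys
  private val underlying = new mutable.HashMap[Any, V]()

  private def wrap(k: Any): Any = if (k == null) NullKey else k

  def update(key: Any, value: V): Unit = underlying(wrap(key)) = value
  def get(key: Any): Option[V] = underlying.get(wrap(key))
}
```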

git commit: [SPARK] Fix NPE for ExternalAppendOnlyMap

2014-07-03 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 87b74a9bf - fdee6ee06 [SPARK] Fix NPE for ExternalAppendOnlyMap It did not handle null keys very gracefully before. Author: Andrew Or andrewo...@gmail.com Closes #1288 from andrewor14/fix-external and squashes the following commits:

git commit: [SPARK-1097] Workaround Hadoop conf ConcurrentModification issue

2014-07-03 Thread adav
Repository: spark Updated Branches: refs/heads/master fdc4c112e - 5fa0a0576 [SPARK-1097] Workaround Hadoop conf ConcurrentModification issue Workaround Hadoop conf ConcurrentModification issue Author: Raymond Liu raymond@intel.com Closes #1273 from colorant/hadoopRDD and squashes the

git commit: [SPARK-1394] Remove SIGCHLD handler in worker subprocess

2014-06-28 Thread adav
Repository: spark Updated Branches: refs/heads/master b8f2e13ae - 3c104c79d [SPARK-1394] Remove SIGCHLD handler in worker subprocess It should not be the responsibility of the worker subprocess, which does not intentionally fork, to try and cleanup child processes. Doing so is complex and

git commit: [SPARK-1112, 2156] (1.0 edition) Use correct akka frame size and overhead amounts.

2014-06-22 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 64316af5a - 67bffd3c7 [SPARK-1112, 2156] (1.0 edition) Use correct akka frame size and overhead amounts. SPARK-1112: This is a more conservative version of #1132 that doesn't change around the actor system initialization on the

git commit: [SPARK-937] adding EXITED executor state and not relaunching cleanly exited executors

2014-06-15 Thread adav
Repository: spark Updated Branches: refs/heads/master 269fc62b2 - ca5d9d43b [SPARK-937] adding EXITED executor state and not relaunching cleanly exited executors There seems to be 2 issues. 1. When job is done, driver asks executor to shutdown. However, this clean exit was assigned FAILED

git commit: [SPARK-937] adding EXITED executor state and not relaunching cleanly exited executors

2014-06-15 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 868cf421e - 609e5ff20 [SPARK-937] adding EXITED executor state and not relaunching cleanly exited executors There seems to be 2 issues. 1. When job is done, driver asks executor to shutdown. However, this clean exit was assigned

svn commit: r1600800 [2/2] - in /spark: ./ site/ site/news/ site/releases/

2014-06-05 Thread adav
Modified: spark/site/releases/spark-release-0-8-1.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-0-8-1.html?rev=1600800&r1=1600799&r2=1600800&view=diff == ---

git commit: Optionally include Hive as a dependency of the REPL.

2014-05-31 Thread adav
Repository: spark Updated Branches: refs/heads/master 3ce81494c - 7463cd248 Optionally include Hive as a dependency of the REPL. Due to the way spark-shell launches from an assembly jar, I don't think this change will affect anyone who isn't trying to launch the shell directly from sbt.

git commit: [SPARK-1901] worker should make sure executor has exited before updating executor's info

2014-05-30 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 80721fb45 - 1696a4470 [SPARK-1901] worker should make sure executor has exited before updating executor's info https://issues.apache.org/jira/browse/SPARK-1901 Author: Zhen Peng zhenpen...@baidu.com Closes #854 from

git commit: SPARK-1932: Fix race conditions in onReceiveCallback and cachedPeers

2014-05-27 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 fcb375026 - 214f90ee7 SPARK-1932: Fix race conditions in onReceiveCallback and cachedPeers `var cachedPeers: Seq[BlockManagerId] = null` is used in `def replicate(blockId: BlockId, data: ByteBuffer, level: StorageLevel)` without
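One hedged way to remove such a race is to confine the cached peer list to a synchronized accessor rather than reading an unsynchronized var; the class below is a simplified illustration, not the actual BlockManager change:

```scala
class PeerCache(fetchPeers: () => Seq[String]) {
  private var cachedPeers: Seq[String] = null // guarded by this

  def peers(forceFetch: Boolean = false): Seq[String] = this.synchronized {
    if (cachedPeers == null || forceFetch) {
      cachedPeers = fetchPeers()
    }
    cachedPeers
  }
}
```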

git commit: bugfix worker DriverStateChanged state should match DriverState.FAILED

2014-05-27 Thread adav
Repository: spark Updated Branches: refs/heads/master 549830b0d - 95e4c9c6f bugfix worker DriverStateChanged state should match DriverState.FAILED bugfix worker DriverStateChanged state should match DriverState.FAILED Author: lianhuiwang lianhuiwan...@gmail.com Closes #864 from

git commit: bugfix worker DriverStateChanged state should match DriverState.FAILED

2014-05-27 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 214f90ee7 - 30be37ca7 bugfix worker DriverStateChanged state should match DriverState.FAILED bugfix worker DriverStateChanged state should match DriverState.FAILED Author: lianhuiwang lianhuiwan...@gmail.com Closes #864 from

git commit: [SPARK-1688] Propagate PySpark worker stderr to driver

2014-05-15 Thread adav
Repository: spark Updated Branches: refs/heads/master d00981a95 - 520087224 [SPARK-1688] Propagate PySpark worker stderr to driver When at least one of the following conditions is true, PySpark cannot be loaded: 1. PYTHONPATH is not set 2. PYTHONPATH does not contain the python directory (or

git commit: SPARK-1801. expose InterruptibleIterator and TaskKilledException in deve...

2014-05-14 Thread adav
Repository: spark Updated Branches: refs/heads/master 6ce088444 - b22952fa1 SPARK-1801. expose InterruptibleIterator and TaskKilledException in deve... ...loper api Author: Koert Kuipers ko...@tresata.com Closes #764 from koertkuipers/feat-rdd-developerapi and squashes the following

git commit: SPARK-1801. expose InterruptibleIterator and TaskKilledException in deve...

2014-05-14 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 f66f76648 - 7da80a318 SPARK-1801. expose InterruptibleIterator and TaskKilledException in deve... ...loper api Author: Koert Kuipers ko...@tresata.com Closes #764 from koertkuipers/feat-rdd-developerapi and squashes the following

git commit: [SPARK-1769] Executor loss causes NPE race condition

2014-05-14 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 b3d987893 - 69ec3149f [SPARK-1769] Executor loss causes NPE race condition This PR replaces the Schedulable data structures in Pool.scala with thread-safe ones from java. Note that Scala's `with SynchronizedBuffer` trait is soon to be
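A small sketch of the swap the snippet describes: replace Scala's Synchronized* mixins with thread-safe java.util.concurrent collections. The String element type and class name are placeholders, not the real Pool.scala:

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}
import scala.collection.JavaConverters._

class PoolLike {
  // Thread-safe collections instead of ArrayBuffer/HashMap with the
  // (since-deprecated) SynchronizedBuffer/SynchronizedMap traits mixed in.
  val schedulableQueue = new ConcurrentLinkedQueue[String]()
  val schedulableNameToSchedulable = new ConcurrentHashMap[String, String]()

  def addSchedulable(name: String): Unit = {
    schedulableQueue.add(name)
    schedulableNameToSchedulable.put(name, name)
  }

  def allNames: Seq[String] = schedulableQueue.asScala.toSeq
}
```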

git commit: SPARK-1770: Revert accidental(?) fix

2014-05-14 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 80f292a21 - 8202276c9 SPARK-1770: Revert accidental(?) fix Looks like this change was accidentally committed here: https://github.com/apache/spark/commit/06b15baab25951d124bbe6b64906f4139e037deb but the change does not show up in the

git commit: [SPARK-1745] Move interrupted flag from TaskContext constructor (minor)

2014-05-10 Thread adav
Repository: spark Updated Branches: refs/heads/master 44dd57fb6 - c3f8b78c2 [SPARK-1745] Move interrupted flag from TaskContext constructor (minor) It makes little sense to start a TaskContext that is interrupted. Indeed, I searched for all use cases of it and didn't find a single instance

git commit: [SPARK-1688] Propagate PySpark worker stderr to driver

2014-05-10 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 0759ee790 - 82c8e89c9 [SPARK-1688] Propagate PySpark worker stderr to driver When at least one of the following conditions is true, PySpark cannot be loaded: 1. PYTHONPATH is not set 2. PYTHONPATH does not contain the python directory

git commit: SPARK-1686: keep schedule() calling in the main thread

2014-05-10 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 8202276c9 - adf8cdd0b SPARK-1686: keep schedule() calling in the main thread https://issues.apache.org/jira/browse/SPARK-1686 moved from original JIRA (by @markhamstra): In deploy.master.Master, the completeRecovery method is the

git commit: SPARK-1700: Close socket file descriptors on task completion

2014-05-03 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 a314342da - d2cbd3d76 SPARK-1700: Close socket file descriptors on task completion This will ensure that sockets do not build up over the course of a job, and that cancellation successfully cleans up sockets. Tested in standalone

git commit: [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak

2014-05-03 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 34f22bcc4 - 0441515f2 [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak Move the doAs in Executor higher up so that we only have 1 ugi and aren't leaking filesystems. Fix spark on yarn to work when the cluster

git commit: [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak

2014-05-03 Thread adav
Repository: spark Updated Branches: refs/heads/master 9347565f4 - 3d0a02dff [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak Move the doAs in Executor higher up so that we only have 1 ugi and aren't leaking filesystems. Fix spark on yarn to work when the cluster is

git commit: [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak

2014-05-03 Thread adav
Repository: spark Updated Branches: refs/heads/branch-0.9 54c3b7e3b - 45561cd9f [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak Move the doAs in Executor higher up so that we only have 1 ugi and aren't leaking filesystems. Fix spark on yarn to work when the cluster

git commit: SPARK-1587 Fix thread leak

2014-04-24 Thread adav
Repository: spark Updated Branches: refs/heads/master bb68f4774 - dd681f502 SPARK-1587 Fix thread leak mvn test fails (intermittently) due to thread leak - since scalatest runs all tests in same vm. Author: Mridul Muralidharan mridul...@apache.org Closes #504 from

git commit: SPARK-1587 Fix thread leak

2014-04-24 Thread adav
Repository: spark Updated Branches: refs/heads/branch-1.0 e8907718a - 8684a15e5 SPARK-1587 Fix thread leak mvn test fails (intermittently) due to thread leak - since scalatest runs all tests in same vm. Author: Mridul Muralidharan mridul...@apache.org Closes #504 from

git commit: [SPARK-1385] Use existing code for JSON de/serialization of BlockId

2014-04-02 Thread adav
Repository: spark Updated Branches: refs/heads/master 11973a7bd - de8eefa80 [SPARK-1385] Use existing code for JSON de/serialization of BlockId `BlockId.scala` offers a way to reconstruct a BlockId from a string through regex matching. `util/JsonProtocol.scala` duplicates this functionality
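The gist is to serialize a BlockId as its name string and rebuild it with the one existing regex-based factory, instead of keeping a parallel JSON structure per subtype. A simplified sketch under invented names (MyBlockId, RddBlock), not the real BlockId hierarchy:

```scala
// Simplified stand-in for the real BlockId hierarchy.
sealed abstract class MyBlockId { def name: String }
case class RddBlock(rddId: Int, split: Int) extends MyBlockId {
  def name: String = s"rdd_${rddId}_$split"
}

object MyBlockId {
  private val Rdd = """rdd_(\d+)_(\d+)""".r
  // Single source of truth for parsing; the JSON code should call this
  // rather than defining its own (de)serialization of each subtype.
  def apply(name: String): MyBlockId = name match {
    case Rdd(r, s) => RddBlock(r.toInt, s.toInt)
    case other     => throw new IllegalArgumentException(s"Unknown block id: $other")
  }
}

// JSON side: write blockId.name, read it back with MyBlockId(name).
```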

git commit: SPARK-1099:Spark's local mode should probably respect spark.cores.max by default

2014-03-19 Thread adav
Repository: spark Updated Branches: refs/heads/master 67fa71cba - 16789317a SPARK-1099:Spark's local mode should probably respect spark.cores.max by default This is for JIRA:https://spark-project.atlassian.net/browse/SPARK-1099 And this is what I do in this patch (also commented in the JIRA)

git commit: SPARK-1160: Deprecate toArray in RDD

2014-03-12 Thread adav
Repository: spark Updated Branches: refs/heads/master b8afe3052 - 9032f7c0d SPARK-1160: Deprecate toArray in RDD https://spark-project.atlassian.net/browse/SPARK-1160 reported by @mateiz: It's redundant with collect() and the name doesn't make sense in Java, where we return a List (we can't