GitHub user naujlove opened a pull request:

    https://github.com/apache/spark/pull/11413

    Branch 1.6

    ## What changes were proposed in this pull request?
    
    (Please fill in changes proposed in this fix)
    
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
    
    
    (If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/spark branch-1.6

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11413
    
----
commit b1d5a7859546eabdc7cf070b3e78d91389a8fbd6
Author: Timothy Hunter <[email protected]>
Date:   2015-12-09T02:40:21Z

    [SPARK-8517][ML][DOC] Reorganizes the spark.ml user guide
    
    This PR moves pieces of the spark.ml user guide to reflect suggestions in 
SPARK-8517. It does not introduce new content, as requested.
    
    <img width="192" alt="screen shot 2015-12-08 at 11 36 00 am" 
src="https://cloud.githubusercontent.com/assets/7594753/11666166/e82b84f2-9d9f-11e5-8904-e215424d8444.png">
    
    Author: Timothy Hunter <[email protected]>
    
    Closes #10207 from thunterdb/spark-8517.
    
    (cherry picked from commit 765c67f5f2e0b1367e37883f662d313661e3a0d9)
    Signed-off-by: Joseph K. Bradley <[email protected]>

commit 9e82273afc68947dc2a08315e0d42cfcedacaa2a
Author: Dominik Dahlem <[email protected]>
Date:   2015-12-09T02:54:10Z

    [SPARK-11343][ML] Documentation of float and double prediction/label 
columns in RegressionEvaluator
    
    felixcheung, mengxr
    
    Just added a message to require()
    
    Author: Dominik Dahlem <[email protected]>
    
    Closes #9598 from 
dahlem/ddahlem_regression_evaluator_double_predictions_message_04112015.
    
    (cherry picked from commit a0046e379bee0852c39ece4ea719cde70d350b0e)
    Signed-off-by: Joseph K. Bradley <[email protected]>

commit 0be792aad5d01432e989a03969541f41a45281e2
Author: Fei Wang <[email protected]>
Date:   2015-12-09T05:32:31Z

    [SPARK-12222] [CORE] Deserializing RoaringBitmap using Kryo serializer throws Buffer underflow exception
    
    Jira: https://issues.apache.org/jira/browse/SPARK-12222
    
    Deserializing a RoaringBitmap using the Kryo serializer throws a Buffer underflow exception:
    ```
    com.esotericsoftware.kryo.KryoException: Buffer underflow.
        at com.esotericsoftware.kryo.io.Input.require(Input.java:156)
        at com.esotericsoftware.kryo.io.Input.skip(Input.java:131)
        at com.esotericsoftware.kryo.io.Input.skip(Input.java:264)
    ```
    
    This is caused by a bug in Kryo's `Input.skip(long count)` (https://github.com/EsotericSoftware/kryo/issues/119), which we call in `KryoInputDataInputBridge`.
    
    Instead of upgrading Kryo, this PR bypasses the buggy `Input.skip(long count)` by directly calling another `skip` method in Kryo's Input.java (https://github.com/EsotericSoftware/kryo/blob/kryo-2.21/src/com/esotericsoftware/kryo/io/Input.java#L124), i.e. it implements a bug-fixed version of `Input.skip(long count)` in `KryoInputDataInputBridge`'s `skipBytes` method.
    
    For more detail, see https://github.com/apache/spark/pull/9748#issuecomment-162860246
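    
    A minimal sketch of the workaround (hypothetical, simplified bridge class; not the exact patch):
    
    ```scala
    import com.esotericsoftware.kryo.io.Input
    
    // Loop over the int-based Input.skip(int), which advances the buffer
    // correctly, instead of calling the buggy Input.skip(long) (kryo #119).
    class KryoInputBridge(input: Input) {
      def skipBytes(n: Int): Int = {
        var remaining: Long = n.toLong
        while (remaining > 0) {
          val toSkip = math.min(Int.MaxValue.toLong, remaining).toInt
          input.skip(toSkip)
          remaining -= toSkip
        }
        n
      }
    }
    ```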
    
    Author: Fei Wang <[email protected]>
    
    Closes #10213 from scwf/patch-1.
    
    (cherry picked from commit 3934562d34bbe08d91c54b4bbee27870e93d7571)
    Signed-off-by: Davies Liu <[email protected]>

commit b5a76b4a40e043c5384be7c620e7ca257b7ef2cd
Author: uncleGen <[email protected]>
Date:   2015-12-09T15:09:40Z

    [SPARK-12031][CORE][BUG] Integer overflow when doing sampling
    
    Author: uncleGen <[email protected]>
    
    Closes #10023 from uncleGen/1.6-bugfix.
    
    (cherry picked from commit a113216865fd45ea39ae8f104e784af2cf667dcf)
    Signed-off-by: Sean Owen <[email protected]>

commit acd462420ab5565ba5bf098f399fb355da3d6139
Author: Holden Karau <[email protected]>
Date:   2015-12-09T16:45:13Z

    [SPARK-10299][ML] word2vec should allow users to specify the window size
    
    Currently word2vec has the window size hard-coded at 5; some users may want different sizes (for example, when using it on n-gram input or similar). The user request comes from http://stackoverflow.com/questions/32231975/spark-word2vec-window-size .
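    
    A minimal usage sketch of the new parameter (assuming the `setWindowSize` setter this patch adds to `spark.ml`'s `Word2Vec`):
    
    ```scala
    import org.apache.spark.ml.feature.Word2Vec
    
    // df is assumed to be a DataFrame with a "text" column of word sequences.
    val word2Vec = new Word2Vec()
      .setInputCol("text")
      .setOutputCol("result")
      .setVectorSize(100)
      .setWindowSize(10)  // previously hard-coded at 5
    val model = word2Vec.fit(df)
    ```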
    
    Author: Holden Karau <[email protected]>
    Author: Holden Karau <[email protected]>
    
    Closes #8513 from 
holdenk/SPARK-10299-word2vec-should-allow-users-to-specify-the-window-size.
    
    (cherry picked from commit 22b9a8740d51289434553d19b6b1ac34aecdc09a)
    Signed-off-by: Sean Owen <[email protected]>

commit 05e441e121a86e0c105ad25010e4678f2f9e73e3
Author: Josh Rosen <[email protected]>
Date:   2015-12-09T19:39:59Z

    [SPARK-12165][SPARK-12189] Fix bugs in eviction of storage memory by 
execution
    
    This patch fixes a bug in the eviction of storage memory by execution.
    
    ## The bug:
    
    In general, execution should be able to evict storage memory when the total 
storage memory usage is greater than `maxMemory * 
spark.memory.storageFraction`. Due to a bug, however, Spark might wind up 
evicting no storage memory in certain cases where the storage memory usage was 
between `maxMemory * spark.memory.storageFraction` and `maxMemory`. For 
example, here is a regression test which illustrates the bug:
    
    ```scala
        val maxMemory = 1000L
        val taskAttemptId = 0L
        val (mm, ms) = makeThings(maxMemory)
        // Since we used the default storage fraction (0.5), we should be able to
        // allocate 500 bytes of storage memory which are immune to eviction by
        // execution memory pressure.

        // Acquire enough storage memory to exceed the storage region size
        assert(mm.acquireStorageMemory(dummyBlock, 750L, evictedBlocks))
        assertEvictBlocksToFreeSpaceNotCalled(ms)
        assert(mm.executionMemoryUsed === 0L)
        assert(mm.storageMemoryUsed === 750L)

        // At this point, storage is using 250 more bytes of memory than it is
        // guaranteed, so execution should be able to reclaim up to 250 bytes of
        // storage memory. Therefore, execution should now be able to require up
        // to 500 bytes of memory:
        assert(mm.acquireExecutionMemory(500L, taskAttemptId, MemoryMode.ON_HEAP) === 500L) // <--- fails by only returning 250L
        assert(mm.storageMemoryUsed === 500L)
        assert(mm.executionMemoryUsed === 500L)
        assertEvictBlocksToFreeSpaceCalled(ms, 250L)
    ```
    
    The problem relates to the control flow / interaction between 
`StorageMemoryPool.shrinkPoolToReclaimSpace()` and 
`MemoryStore.ensureFreeSpace()`. While trying to allocate the 500 bytes of 
execution memory, the `UnifiedMemoryManager` discovers that it will need to 
reclaim 250 bytes of memory from storage, so it calls 
`StorageMemoryPool.shrinkPoolToReclaimSpace(250L)`. This method, in turn, calls 
`MemoryStore.ensureFreeSpace(250L)`. However, `ensureFreeSpace()` first checks 
whether the requested space is less than `maxStorageMemory - 
storageMemoryUsed`, which will be true if there is any free execution memory 
because it turns out that `MemoryStore.maxStorageMemory = (maxMemory - 
onHeapExecutionMemoryPool.memoryUsed)` when the `UnifiedMemoryManager` is used.
    
    The control flow here is somewhat confusing (it grew messy over time, as several components were merged and refactored). In the pre-Spark 1.6 code, `ensureFreeSpace` was called directly by the `MemoryStore` itself, whereas in 1.6 it's involved in a confusing control flow where `MemoryStore` calls `MemoryManager.acquireStorageMemory`, which then calls back into `MemoryStore.ensureFreeSpace`, which, in turn, calls `MemoryManager.freeStorageMemory`.
    
    ## The solution:
    
    The solution implemented in this patch is to remove the confusing circular 
control flow between `MemoryManager` and `MemoryStore`, making the storage 
memory acquisition process much more linear / straightforward. The key changes:
    
    - Remove a layer of inheritance which made the memory manager code harder 
to understand (53841174760a24a0df3eb1562af1f33dbe340eb9).
    - Move some bounds checks earlier in the call chain 
(13ba7ada77f87ef1ec362aec35c89a924e6987cb).
    - Refactor `ensureFreeSpace()` so that the part which evicts blocks can be 
called independently from the part which checks whether there is enough free 
space to avoid eviction (7c68ca09cb1b12f157400866983f753ac863380e).
    - Realize that this lets us remove a layer of overloads from 
`ensureFreeSpace` (eec4f6c87423d5e482b710e098486b3bbc4daf06).
    - Realize that `ensureFreeSpace()` can simply be replaced with an 
`evictBlocksToFreeSpace()` method which is called [after we've already figured 
out](https://github.com/apache/spark/blob/2dc842aea82c8895125d46a00aa43dfb0d121de9/core/src/main/scala/org/apache/spark/memory/StorageMemoryPool.scala#L88)
 how much memory needs to be reclaimed via eviction; 
(2dc842aea82c8895125d46a00aa43dfb0d121de9).
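    
    A simplified sketch of the resulting linear flow (hypothetical, condensed from the changes above; dependencies passed in as parameters for illustration):
    
    ```scala
    // Compute the shortfall first, then ask the store to evict exactly that
    // much -- no circular ensureFreeSpace() callback between MemoryManager
    // and MemoryStore.
    def shrinkPoolToFreeSpace(
        spaceToFree: Long,
        memoryFree: Long,                     // unused space already in the pool
        evictBlocksToFreeSpace: Long => Long  // MemoryStore hook; returns bytes freed
      ): Long = {
      val freedWithoutEviction = math.min(spaceToFree, memoryFree)
      val remaining = spaceToFree - freedWithoutEviction
      val freedByEviction =
        if (remaining > 0) evictBlocksToFreeSpace(remaining) else 0L
      freedWithoutEviction + freedByEviction
    }
    ```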
    
    Along the way, I fixed some problems with the mocks in 
`MemoryManagerSuite`: the old mocks would 
[unconditionally](https://github.com/apache/spark/blob/80a824d36eec9d9a9f092ee1741453851218ec73/core/src/test/scala/org/apache/spark/memory/MemoryManagerSuite.scala#L84)
 report that a block had been evicted even if there was enough space in the 
storage pool such that eviction would be avoided.
    
    I also fixed a problem where `StorageMemoryPool._memoryUsed` might become negative due to freed memory being double-counted when execution evicts storage. The problem was that `StorageMemoryPool.shrinkPoolToFreeSpace` would [decrement `_memoryUsed`](https://github.com/apache/spark/commit/7c68ca09cb1b12f157400866983f753ac863380e#diff-935c68a9803be144ed7bafdd2f756a0fL133) even though `StorageMemoryPool.freeMemory` had already decremented it as each evicted block was freed. See SPARK-12189 for details.
    
    Author: Josh Rosen <[email protected]>
    Author: Andrew Or <[email protected]>
    
    Closes #10170 from JoshRosen/SPARK-12165.
    
    (cherry picked from commit aec5ea000ebb8921f42f006b694ef26f5df67d83)
    Signed-off-by: Andrew Or <[email protected]>

commit ee0a6e72234e4f672a2939b794c904026f696398
Author: Sean Owen <[email protected]>
Date:   2015-12-09T19:47:38Z

    [SPARK-11824][WEBUI] WebUI does not render descriptions with 'bad' HTML, 
throws console error
    
    Don't warn when a description isn't valid HTML, since it may legitimately be something like "SELECT ... WHERE foo <= 1".
    
    The tests for this code indicate that it's normal to handle strings like this, which don't contain HTML, as plain strings rather than markup. Hence logging every such instance as a warning is too noisy, since it's not a problem. This is an issue for stages whose names contain SQL like the above.
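    
    A minimal sketch of the described fallback (using Scala's `scala.xml` parser as an illustration, not the exact patch):
    
    ```scala
    import scala.xml.{Node, Text, XML}
    
    // Try to render the description as markup; if it isn't well-formed XML
    // (e.g. "SELECT ... WHERE foo <= 1"), quietly fall back to plain text
    // instead of logging a warning for every such string.
    def makeDescription(desc: String): Node = {
      try {
        XML.loadString(s"<span>$desc</span>")
      } catch {
        case _: Exception => Text(desc) // not markup; render verbatim
      }
    }
    ```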
    
    CC tdas as author of this bit of code
    
    Author: Sean Owen <[email protected]>
    
    Closes #10159 from srowen/SPARK-11824.
    
    (cherry picked from commit 1eb7c22ce72a1b82ed194a51bbcf0da9c771605a)
    Signed-off-by: Sean Owen <[email protected]>

commit bfb4201395c6a1905c6eb46de4ea3eefe8d17309
Author: Xusen Yin <[email protected]>
Date:   2015-12-09T20:00:48Z

    [SPARK-11551][DOC] Replace example code in ml-features.md using 
include_example
    
    PR on behalf of somideshmukh, thanks!
    
    Author: Xusen Yin <[email protected]>
    Author: somideshmukh <[email protected]>
    
    Closes #10219 from yinxusen/SPARK-11551.
    
    (cherry picked from commit 051c6a066f7b5fcc7472412144c15b50a5319bd5)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit 9bc6a27fdc5db3815958c721737a195af93f3757
Author: Andrew Ray <[email protected]>
Date:   2015-12-10T01:16:01Z

    [SPARK-12211][DOC][GRAPHX] Fix version number in graphx doc for migration 
from 1.1
    
    The "Migrating from Spark 1.1" section added to the GraphX doc in 1.2.0 (see https://spark.apache.org/docs/1.2.0/graphx-programming-guide.html#migrating-from-spark-11) uses {{site.SPARK_VERSION}} as the version where changes were introduced; it should be just 1.2.
    
    Author: Andrew Ray <[email protected]>
    
    Closes #10206 from aray/graphx-doc-1.1-migration.
    
    (cherry picked from commit 7a8e587dc04c2fabc875d1754eae7f85b4fba6ba)
    Signed-off-by: Joseph K. Bradley <[email protected]>

commit d86a88da677041d3c4ab484ed6f4f152674091f0
Author: Andrew Or <[email protected]>
Date:   2015-12-10T01:24:04Z

    [SPARK-12165][ADDENDUM] Fix outdated comments on unroll test
    
    JoshRosen
    
    Author: Andrew Or <[email protected]>
    
    Closes #10229 from andrewor14/unroll-test-comments.
    
    (cherry picked from commit 8770bd1213f9b1051dabde9c5424ae7b32143a44)
    Signed-off-by: Josh Rosen <[email protected]>

commit 9fe8dc916e8a30914199b1fbb8c3765ba742559a
Author: Yin Huai <[email protected]>
Date:   2015-12-10T02:09:36Z

    [SPARK-11678][SQL][DOCS] Document basePath in the programming guide.
    
    This PR adds documentation for `basePath`, a new parameter used by `HadoopFsRelation`.
    
    The compiled doc is shown below.
    
![image](https://cloud.githubusercontent.com/assets/2072857/11673132/1ba01192-9dcb-11e5-98d9-ac0b4e92e98c.png)
    
    JIRA: https://issues.apache.org/jira/browse/SPARK-11678
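    
    For illustration, a minimal usage sketch of the option being documented (the partitioned layout `/path/to/table/gender=male/...` and `sqlContext` are assumptions):
    
    ```scala
    // Setting basePath tells partition discovery where the table root is, so
    // gender is still treated as a partition column even when only a
    // subdirectory is read.
    val df = sqlContext.read
      .option("basePath", "/path/to/table")
      .parquet("/path/to/table/gender=male")
    ```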
    
    Author: Yin Huai <[email protected]>
    
    Closes #10211 from yhuai/basePathDoc.
    
    (cherry picked from commit ac8cdf1cdc148bd21290ecf4d4f9874f8c87cc14)
    Signed-off-by: Yin Huai <[email protected]>

commit 699f497cf7ceefbaed689b6f3515f8a2ebc636ca
Author: Mark Grover <[email protected]>
Date:   2015-12-10T02:37:35Z

    [SPARK-11796] Fix httpclient and httpcore dependency issues related to docker-client
    
    This commit fixes dependency issues which prevented the Docker-based JDBC 
integration tests from running in the Maven build.
    
    Author: Mark Grover <[email protected]>
    
    Closes #9876 from markgrover/master_docker.
    
    (cherry picked from commit 2166c2a75083c2262e071a652dd52b1a33348b6e)
    Signed-off-by: Josh Rosen <[email protected]>

commit f6d8661738b5a4b139c4800d5c4e9f0094068451
Author: Tathagata Das <[email protected]>
Date:   2015-12-10T04:47:15Z

    [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to 
mapWithState and change tracking function signature
    
    SPARK-12244:
    
    Based on feedback from early users and personal experience attempting to explain it, the name trackStateByKey had two problems:
    - "trackState" is a completely new term which does not give any intuition about what the operation is.
    - The resultant data stream of objects returned by the function is referred to in the docs as the "emitted" data, for lack of a better term.
    
    "mapWithState" makes sense because the API is like a mapping function, (Key, Value) => T, with State as an additional parameter. The resultant data stream is the "mapped data". So both problems are solved.
    
    SPARK-12245:
    
    From initial experience, not having the key in the function makes it hard to return mapped output, as the full information of the record is not available. Basically, the user is restricted to doing something like mapValues() instead of map(), so the key is added as a parameter. A sketch of the resulting API shape follows.
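    
    A minimal sketch of the renamed API shape (signatures paraphrased from the description; illustrative only):
    
    ```scala
    import org.apache.spark.streaming.{State, StateSpec}
    
    // The mapping function now receives the key along with the new value and
    // the running state, and returns the "mapped" record emitted downstream.
    val mappingFunc = (key: String, value: Option[Int], state: State[Int]) => {
      val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
      state.update(sum)
      (key, sum)
    }
    
    // pairDStream is assumed to be a DStream[(String, Int)] from earlier setup.
    val mappedDStream = pairDStream.mapWithState(StateSpec.function(mappingFunc))
    ```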
    
    Author: Tathagata Das <[email protected]>
    
    Closes #10224 from tdas/rename.

commit b5e5812f9ef8aa8d133a75bb8aa8dd8680130efa
Author: bomeng <[email protected]>
Date:   2015-12-10T12:53:53Z

    [SPARK-12136][STREAMING] rddToFileName does not properly handle prefix and 
suffix parameters
    
    The original code does not properly handle the case where the prefix is null but the suffix is not: the suffix should be used but is not.
    
    The fix uses a StringBuilder to construct the proper file name, as sketched below.
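    
    A minimal sketch of the fixed logic (hypothetical, simplified version of the described fix):
    
    ```scala
    import org.apache.spark.streaming.Time
    
    // Build the name with a StringBuilder so a null/empty prefix no longer
    // drops a non-null suffix.
    def rddToFileName(prefix: String, suffix: String, time: Time): String = {
      val sb = new StringBuilder()
      if (prefix != null && prefix.nonEmpty) sb.append(prefix).append("-")
      sb.append(time.milliseconds)
      if (suffix != null && suffix.nonEmpty) sb.append(".").append(suffix)
      sb.toString()
    }
    ```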
    
    Author: bomeng <[email protected]>
    Author: Bo Meng <[email protected]>
    
    Closes #10185 from bomeng/SPARK-12136.
    
    (cherry picked from commit e29704f90dfe67d9e276d242699ac0a00f64fb91)
    Signed-off-by: Sean Owen <[email protected]>

commit f939c71b187cff3a5bb63aa3659429b6efb0626d
Author: Reynold Xin <[email protected]>
Date:   2015-12-10T14:23:10Z

    [SPARK-12242][SQL] Add DataFrame.transform method
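    
    For context, a hedged sketch of how such a transform method chains user functions (illustrative, not taken from the patch):
    
    ```scala
    import org.apache.spark.sql.DataFrame
    
    // Hypothetical helper; df is assumed to have a numeric "value" column.
    def withDoubled(df: DataFrame): DataFrame =
      df.withColumn("doubled", df("value") * 2)
    
    // transform lets DataFrame => DataFrame functions chain fluently:
    val result = df.transform(withDoubled)
    ```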
    
    Author: Reynold Xin <[email protected]>
    
    Closes #10226 from rxin/df-transform.
    
    (cherry picked from commit 76540b6df5370b463277d3498097b2cc2d2e97a8)
    Signed-off-by: Reynold Xin <[email protected]>

commit b7b9f772751dc4ea7eb28a2bdb897a04e563fafa
Author: Yanbo Liang <[email protected]>
Date:   2015-12-10T17:44:53Z

    [SPARK-12198][SPARKR] SparkR support read.parquet and deprecate parquetFile
    
    SparkR supports ```read.parquet``` and deprecates ```parquetFile```. This change is similar to #10145 for ```jsonFile```.
    
    Author: Yanbo Liang <[email protected]>
    
    Closes #10191 from yanboliang/spark-12198.
    
    (cherry picked from commit eeb58722ad73441eeb5f35f864be3c5392cfd426)
    Signed-off-by: Shivaram Venkataraman <[email protected]>

commit e65c88536ad1843a45e0fe3cf1edadfdf4ad3460
Author: Yuhao Yang <[email protected]>
Date:   2015-12-10T18:15:50Z

    [SPARK-11602][MLLIB] Refine visibility for 1.6 scala API audit
    
    jira: https://issues.apache.org/jira/browse/SPARK-11602
    
    Made a pass on the API changes for 1.6. Opening the PR for efficient discussion.
    
    Author: Yuhao Yang <[email protected]>
    
    Closes #9939 from hhbyyh/auditScala.
    
    (cherry picked from commit 9fba9c8004d2b97549e5456fa7918965bec27336)
    Signed-off-by: Joseph K. Bradley <[email protected]>

commit 93ef2463820928f434f9fb1542bc30cfb1cec9aa
Author: Yanbo Liang <[email protected]>
Date:   2015-12-10T18:18:58Z

    [SPARK-12234][SPARKR] Fix ```subset``` function error when only the ```select``` argument is set
    
    Fix the ```subset``` function error that occurs when only the ```select``` argument is set. Please refer to the [JIRA](https://issues.apache.org/jira/browse/SPARK-12234) about the error and how to reproduce it.
    
    cc sun-rui felixcheung shivaram
    
    Author: Yanbo Liang <[email protected]>
    
    Closes #10217 from yanboliang/spark-12234.
    
    (cherry picked from commit d9d354ed40eec56b3f03d32f4e2629d367b1bf02)
    Signed-off-by: Shivaram Venkataraman <[email protected]>

commit e541f703d72d3dd3ad96db55650c5b1a1a5a38e2
Author: Cheng Lian <[email protected]>
Date:   2015-12-10T18:19:44Z

    [SPARK-12012][SQL][BRANCH-1.6] Show more comprehensive PhysicalRDD metadata 
when visualizing SQL query plan
    
    This PR backports PR #10004 to branch-1.6
    
    It adds a private[sql] method, `metadata`, to `SparkPlan`, which can be used to describe detailed information about a physical plan during visualization. Specifically, this PR uses this method to provide details of `PhysicalRDD`s translated from a data source relation.
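    
    A hedged sketch of the described hook (stand-in names; the real method is private[sql] on SparkPlan):
    
    ```scala
    // Each plan node can expose key/value metadata that the SQL tab renders
    // under the node in the visualized plan.
    trait PlanNode {
      def metadata: Map[String, String] = Map.empty
    }
    
    // A data-source scan node can override it to surface its details in the UI:
    class PhysicalScan(schemaString: String, pushedFilters: Seq[String]) extends PlanNode {
      override def metadata: Map[String, String] = Map(
        "ReadSchema"    -> schemaString,
        "PushedFilters" -> pushedFilters.mkString("[", ", ", "]"))
    }
    ```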
    
    Author: Cheng Lian <[email protected]>
    
    Closes #10250 from liancheng/spark-12012.for-1.6.

commit 594fafc6122ad9c6b24bdb4a434d97158c7745f3
Author: Yin Huai <[email protected]>
Date:   2015-12-10T20:03:29Z

    [SPARK-12250][SQL] Allow users to define a UDAF without providing details 
of its inputSchema
    
    https://issues.apache.org/jira/browse/SPARK-12250
    
    Author: Yin Huai <[email protected]>
    
    Closes #10236 from yhuai/SPARK-12250.
    
    (cherry picked from commit bc5f56aa60a430244ffa0cacd81c0b1ecbf8d68f)
    Signed-off-by: Yin Huai <[email protected]>

commit d0307deaa29d5fcf1c675f9367c26aa6a3db3fba
Author: Timothy Hunter <[email protected]>
Date:   2015-12-10T20:50:46Z

    [SPARK-12212][ML][DOC] Clarifies the difference between spark.ml, 
spark.mllib and mllib in the documentation.
    
    Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark).
    
    It also removes some files that I forgot to delete with #10207
    
    Author: Timothy Hunter <[email protected]>
    
    Closes #10234 from thunterdb/12212.
    
    (cherry picked from commit 2ecbe02d5b28ee562d10c1735244b90a08532c9e)
    Signed-off-by: Joseph K. Bradley <[email protected]>

commit 9870e5c7af87190167ca3845ede918671b9420ca
Author: Josh Rosen <[email protected]>
Date:   2015-12-10T23:29:04Z

    [SPARK-12251] Document and improve off-heap memory configurations
    
    This patch adds documentation for Spark configurations that affect off-heap 
memory and makes some naming and validation improvements for those configs.
    
    - Change `spark.memory.offHeapSize` to `spark.memory.offHeap.size`. This is 
fine because this configuration has not shipped in any Spark release yet (it's 
new in Spark 1.6).
    - Deprecate `spark.unsafe.offHeap` in favor of a new `spark.memory.offHeap.enabled` configuration. The motivation behind this change is to gather all memory-related configurations under the same prefix.
    - Add a check which prevents users from setting 
`spark.memory.offHeap.enabled=true` when `spark.memory.offHeap.size == 0`. 
After SPARK-11389 (#9344), which was committed in Spark 1.6, Spark enforces a 
hard limit on the amount of off-heap memory that it will allocate to tasks. As 
a result, enabling off-heap execution memory without setting 
`spark.memory.offHeap.size` will lead to immediate OOMs. The new configuration 
validation makes this scenario easier to diagnose, helping to avoid user 
confusion.
    - Document these configurations on the configuration page.
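    
    A minimal sketch of the resulting configuration (option names as described above; the size value is an arbitrary example):
    
    ```scala
    import org.apache.spark.SparkConf
    
    val conf = new SparkConf()
      .set("spark.memory.offHeap.enabled", "true")    // replaces spark.unsafe.offHeap
      .set("spark.memory.offHeap.size", "536870912")  // must be non-zero when enabled
    ```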
    
    Author: Josh Rosen <[email protected]>
    
    Closes #10237 from JoshRosen/SPARK-12251.
    
    (cherry picked from commit 23a9e62bad9669e9ff5dc4bd714f58d12f9be0b5)
    Signed-off-by: Andrew Or <[email protected]>

commit c247b6a6546e12e3c6992c40cad1881d56aefd6f
Author: Andrew Or <[email protected]>
Date:   2015-12-10T23:30:08Z

    [SPARK-12155][SPARK-12253] Fix executor OOM in unified memory management
    
    **Problem.** In unified memory management, acquiring execution memory may lead to eviction of storage memory. However, the space freed from evicting cached blocks is distributed among all active tasks. Thus, an incorrect upper bound on the execution memory per task can cause the acquisition to fail, leading to OOMs and premature spills.
    
    **Example.** Suppose total memory is 1000B, cached blocks occupy 900B, 
`spark.memory.storageFraction` is 0.4, and there are two active tasks. In this 
case, the cap on task execution memory is 100B / 2 = 50B. If task A tries to 
acquire 200B, it will evict 100B of storage but can only acquire 50B because of 
the incorrect cap. For another example, see this [regression 
test](https://github.com/andrewor14/spark/blob/fix-oom/core/src/test/scala/org/apache/spark/memory/UnifiedMemoryManagerSuite.scala#L233)
 that I stole from JoshRosen.
    
    **Solution.** Fix the cap on task execution memory. It should take into 
account the space that could have been freed by storage in addition to the 
current amount of memory available to execution. In the example above, the 
correct cap should have been 600B / 2 = 300B.
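    
    A worked sketch of the corrected cap computation for the example above (hypothetical variable names):
    
    ```scala
    val maxMemory         = 1000L
    val storageFraction   = 0.4
    val storageRegionSize = (maxMemory * storageFraction).toLong          // 400B guaranteed to storage
    val storageUsed       = 900L                                          // cached blocks
    val freeExecution     = maxMemory - storageUsed                       // 100B currently free
    val evictableStorage  = math.max(0L, storageUsed - storageRegionSize) // 500B reclaimable by eviction
    val maxExecution      = freeExecution + evictableStorage              // 600B available to execution
    val numActiveTasks    = 2
    val perTaskCap        = maxExecution / numActiveTasks                 // 300B (the buggy cap was 100B / 2 = 50B)
    ```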
    
    This patch also guards against the race condition (SPARK-12253):
    (1) Existing tasks collectively occupy all execution memory
    (2) New task comes in and blocks while existing tasks spill
    (3) After tasks finish spilling, another task jumps in and puts in a large 
block, stealing the freed memory
    (4) New task still cannot acquire memory and goes back to sleep
    
    Author: Andrew Or <[email protected]>
    
    Closes #10240 from andrewor14/fix-oom.
    
    (cherry picked from commit 5030923ea8bb94ac8fa8e432de9fc7089aa93986)
    Signed-off-by: Andrew Or <[email protected]>

commit 5d3722f8e5cdb4abd946ea18950225919af53a11
Author: jerryshao <[email protected]>
Date:   2015-12-10T23:31:46Z

    [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc
    
    With the merge of [SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), the Python API now has the same functionality as Scala/Java, so this changes the description to make it more precise.
    
    zsxwing tdas , please review, thanks a lot.
    
    Author: jerryshao <[email protected]>
    
    Closes #10246 from jerryshao/direct-kafka-doc-update.
    
    (cherry picked from commit 24d3357d66e14388faf8709b368edca70ea96432)
    Signed-off-by: Shixiong Zhu <[email protected]>

commit d09af2cb4237cca9ac72aacb9abb822a2982a820
Author: Davies Liu <[email protected]>
Date:   2015-12-11T01:22:18Z

    [SPARK-12258][SQL] passing null into ScalaUDF
    
    Check the nullability of inputs and pass nulls into ScalaUDF correctly.
    
    Closes #10249
    
    Author: Davies Liu <[email protected]>
    
    Closes #10259 from davies/udf_null.
    
    (cherry picked from commit b1b4ee7f3541d92c8bc2b0b4fdadf46cfdb09504)
    Signed-off-by: Yin Huai <[email protected]>

commit 3e39925f9296bc126adf3f6828a0adf306900c0a
Author: Patrick Wendell <[email protected]>
Date:   2015-12-11T02:45:36Z

    Preparing Spark release v1.6.0-rc2

commit 250249e26466ff0d6ee6f8ae34f0225285c9bb9b
Author: Patrick Wendell <[email protected]>
Date:   2015-12-11T02:45:42Z

    Preparing development version 1.6.0-SNAPSHOT

commit eec36607f9fc92b6c4d306e3930fcf03961625eb
Author: Davies Liu <[email protected]>
Date:   2015-12-11T19:15:53Z

    [SPARK-12258] [SQL] passing null into ScalaUDF (follow-up)
    
    This is a follow-up PR for #10259
    
    Author: Davies Liu <[email protected]>
    
    Closes #10266 from davies/null_udf2.
    
    (cherry picked from commit c119a34d1e9e599e302acfda92e5de681086a19f)
    Signed-off-by: Davies Liu <[email protected]>

commit 23f8dfd45187cb8f2216328ab907ddb5fbdffd0b
Author: Patrick Wendell <[email protected]>
Date:   2015-12-11T19:25:03Z

    Preparing Spark release v1.6.0-rc2

commit 2e4523161ddf2417f2570bb75cc2d6694813adf5
Author: Patrick Wendell <[email protected]>
Date:   2015-12-11T19:25:09Z

    Preparing development version 1.6.0-SNAPSHOT

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
