GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/11613

    [SPARK-10907][SPARK-6157][WIP] Remove pendingUnrollMemory from MemoryStore

    This patch refactors the MemoryStore to remove the concept of 
`pendingUnrollMemory`. It also fixes SPARK-6157: "Unrolling with 
MEMORY_AND_DISK should always release memory".
    
    Key changes:
    
    - Inline `MemoryStore.tryToPut` at its three call sites in the 
`MemoryStore`.
    - Inline `MemoryStore.unrollSafely` at its only call site (in 
`MemoryStore.putIterator`).
    - Inline `MemoryManager.acquireStorageMemory` at its call sites.
    - Simplify the code as a result of this inlining (some parameters have 
fixed values after inlining, so lots of branches can be removed).
    - Remove the `pendingUnrollMemory` map by returning the amount of 
unrollMemory allocated when returning an iterator after a failed `putIterator` 
call.
    - Change `putIterator` to return an instance of 
`PartiallyUnrolledIterator`, a special iterator subclass which will 
automatically free the unroll memory of its partially-unrolled elements when 
the iterator is consumed. To handle cases where the iterator is not consumed 
(e.g. when a MEMORY_ONLY put fails), `PartiallyUnrolledIterator` exposes a 
`close()` method which may be called to discard the unrolled values and free 
their memory.
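
To illustrate the last bullet, here is a minimal, self-contained sketch of how such an iterator could behave. It is not the code in this patch: the class name `PartiallyUnrolledIteratorSketch`, the `releaseUnrollMemory` callback, and the constructor shape are all assumptions made for illustration, and the real class presumably talks to the `MemoryStore` directly rather than taking a callback.

```scala
// Hypothetical sketch, not the actual Spark implementation.
// `releaseUnrollMemory` is an assumed callback standing in for whatever
// returns unroll memory to the memory manager.
final class PartiallyUnrolledIteratorSketch[T](
    unrollMemory: Long,                 // bytes reserved for the unrolled prefix
    releaseUnrollMemory: Long => Unit,  // assumed release hook
    private var unrolled: Iterator[T],  // values already materialized in memory
    rest: Iterator[T])                  // values not yet read from the source
  extends Iterator[T] {

  private var released = false

  // Frees the unroll memory at most once and drops the unrolled values.
  private def releaseOnce(): Unit = {
    if (!released) {
      released = true
      unrolled = Iterator.empty
      releaseUnrollMemory(unrollMemory)
    }
  }

  override def hasNext: Boolean = {
    // Release as soon as we observe that the unrolled prefix is exhausted.
    if (!released && !unrolled.hasNext) releaseOnce()
    unrolled.hasNext || rest.hasNext
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("next on empty iterator")
    if (unrolled.hasNext) unrolled.next() else rest.next()
  }

  // For callers that never consume the iterator (e.g. a failed MEMORY_ONLY put).
  def close(): Unit = releaseOnce()
}

object PartiallyUnrolledIteratorDemo {
  def main(args: Array[String]): Unit = {
    // Case 1: full consumption frees the unroll memory exactly once.
    var freed = 0L
    val consumed = new PartiallyUnrolledIteratorSketch[Int](
      64L, bytes => freed += bytes, Iterator(1, 2), Iterator(3, 4))
    assert(consumed.toList == List(1, 2, 3, 4))
    assert(freed == 64L)
    consumed.close() // no-op: memory was already released
    assert(freed == 64L)

    // Case 2: close() without consuming also frees the memory.
    var freedOnClose = 0L
    val abandoned = new PartiallyUnrolledIteratorSketch[Int](
      32L, bytes => freedOnClose += bytes, Iterator(1, 2), Iterator(3, 4))
    abandoned.close()
    assert(freedOnClose == 32L)
  }
}
```

Because the iterator itself carries the amount of unroll memory it owns, a sketch like this needs no side table such as `pendingUnrollMemory` to remember what must be freed later. Note that in this sketch the release happens on the first `hasNext` call after the last unrolled element has been returned, not at the moment that element is returned.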
    
    This patch is marked WIP because it's currently rebased on top of #11534 
and needs additional doc, comment, and test updates before it will be ready to 
merge. Here's a link to the actual diff: 
https://github.com/apache/spark/compare/66796b5...JoshRosen:cleanup-unroll-memory

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark cleanup-unroll-memory

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11613.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11613
    
----
commit 3d51b00b5724aa94bf5ed7967f881de5988c5b52
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:07:59Z

    Delete the BlockStore interface.

commit 14d003ffe06ac763e128a1b3fbee8fb22fb2cc17
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:13:14Z

    Remove unused DiskStore.getBytes() overloads.

commit 1a764c05a58c5ef299508569ac4096b5fa0468d9
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:14:36Z

    DiskStore.getBytes() never returns None, so it shouldn't return an Option.

commit 9814ba6b479850ca9c198af1556e16017a769280
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:31:02Z

    Simplify DiskStore.putIterator's return type.

commit 14d5652107cdfc8c32e94231cfc2919339d2a2a5
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:33:03Z

    Remove MemoryStore.putIterator() overload.

commit 5c294ace7dcdabf4e325367866eac18aa6e16efb
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T01:39:06Z

    DiskStore put() methods don't need to take a StorageLevel.

commit c8d0e695f473590bed1cc318f98c8445834c00a6
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:05:23Z

    Factor common error-handling code in DiskStore.put*() into helper function.

commit 46b3877a462a19b9284990e05830f9782629235d
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:09:45Z

    Remove DiskStore.putIterator().

commit 27cee47a6228e470ab54891fd1d2dec676a49938
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:21:57Z

    Remove DiskStore's dependency on BlockManager.

commit 1a50c8115f31a1acd3b0a4dd5e1b3d28e81d05f5
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:25:31Z

    Minor simplifications in DiskStoreSuite test.

commit f3b60052c42215d1440cc5660bdea443bed5598e
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:29:18Z

    Remove outdated comment in DiskBlockManager.

commit 2d86e290f77c15aa8ba6e6e46f6c39987ee351a5
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T02:39:25Z

    Remove DiskStore's dependency on BlockManager.

commit 9e3ae78f62310aa291833842113c9832ac520bfa
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T03:28:49Z

    Shorten period of holding memoryManager lock.

commit d8487d4e8ee5bb5b64d60f1158fc7420ac6a2a54
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T03:48:35Z

    MemoryStore.put() no longer handles dropping to disk.
    
    This is now handled by the caller.

commit 10a667d62642ab478ab00b5e0267be93d6b01417
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T04:06:36Z

    MemoryStore.putBytes() shouldn't perform deserialization.

commit 87e775d585d2db7c91af9c2587df2eb395040248
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T19:31:46Z

    MemoryStore should take its own conf, not obtain it from BlockManager.

commit 2923850c27931cd8efb49449b19438e82763c39e
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T19:53:47Z

    Move MemoryManager into new o.a.s.storage.memory package

commit 40f4e436e2d99eebc41b5f8703936f8497b9443c
Author: Josh Rosen <[email protected]>
Date:   2016-03-04T22:33:28Z

    getBytes() and getValues() no longer implicitly serialize / deserialize.

commit 495ad976699ab05a8b452c39c65ebcc13c1718db
Author: Josh Rosen <[email protected]>
Date:   2016-03-05T00:00:00Z

    Split doGetLocal() and getLocal() into smaller, simpler methods.

commit 032e3a3b62e70b653a97bb2353c85087f9e4f843
Author: Josh Rosen <[email protected]>
Date:   2016-03-05T19:05:50Z

    Fix scalastyle violations.

commit 988f00393676eabfc11e665f20f9ce26388e4c11
Author: Josh Rosen <[email protected]>
Date:   2016-03-05T21:02:38Z

    Fix leaked lock in getOrElseUpdate() when block already exists.

commit 31a500834bab30d3a162885bfc88b08d2c7ffb0f
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T19:13:44Z

    Document lock requirements of doGetLocalBytes

commit ca5a3f30fdf74694fa9bf5e1352133df3051257e
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T19:15:26Z

    Add clarifying comment to doGetLocalBytes()

commit 14857b3e53d1b496b7ff07099c53ab2d775f950a
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T19:17:51Z

    Remove unnecessary putBlockInfo.synchronized call

commit 92c5125f2f4736335971e779fc39e9fa74f8c310
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T23:34:05Z

    Remove effectiveStorageLevel from put() APIs.

commit 7a08a179f8951abbdcb7e70f6bfb53821fbc7352
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T23:59:34Z

    Split doPut() into doPutBytes() and doPutIterator().

commit a16276e1cfb92fba88f2963b0d918aa14f2500a4
Author: Josh Rosen <[email protected]>
Date:   2016-03-09T00:01:46Z

    Remove unreachable level == StorageLevel.NONE case.
    
    This is unreachable because we check whether level.isValid earlier in the 
same method.

commit 82886e03f60e2b4b0b97b0c6ae640ca10dede145
Author: Josh Rosen <[email protected]>
Date:   2016-03-09T00:03:43Z

    Fix statement without side-effects.

commit 66796b5bf89ddedc9644b9f5692441293c0c0aaa
Author: Josh Rosen <[email protected]>
Date:   2016-03-09T00:15:11Z

    Minor comment reword.

commit dbd164e5d7d147b39993b60390adf0a6b84c0ac8
Author: Josh Rosen <[email protected]>
Date:   2016-03-08T19:45:50Z

    Make unrollSafely private.

----


