[ 
https://issues.apache.org/jira/browse/SPARK-35396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chendi.Xue updated SPARK-35396:
-------------------------------
    Description: 
This PR proposes an add-on that supports manually closing entries in 
MemoryStore and InMemoryRelation.
h3. What changes were proposed in this pull request?

Currently:
MemoryStore uses a LinkedHashMap[BlockId, MemoryEntry[_]] to store all OnHeap 
or OffHeap entries.
When memoryStore.remove(blockId) is called, the code simply removes the entry 
from the LinkedHashMap and relies on Java GC to release it.

This PR:
We propose an add-on that manually closes any object stored in MemoryStore 
or InMemoryRelation if the object implements AutoCloseable.
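
To illustrate the idea, here is a minimal sketch of a store that closes 
AutoCloseable values on remove() and clear(). It is not the actual patch: 
SketchStore and Entry are hypothetical stand-ins for MemoryStore and 
MemoryEntry, and a String key is assumed in place of BlockId.

```scala
import java.util.{ArrayList, LinkedHashMap}
import scala.util.control.NonFatal

// Hypothetical stand-in for Spark's MemoryEntry; not the real class.
final case class Entry(value: Any, size: Long)

// Hypothetical, simplified stand-in for MemoryStore, keyed by String.
class SketchStore {
  private val entries = new LinkedHashMap[String, Entry]()

  def put(id: String, entry: Entry): Unit = entries.synchronized {
    entries.put(id, entry)
    ()
  }

  // Close the stored value only if it opts in by implementing AutoCloseable.
  private def closeIfNeeded(entry: Entry): Unit = entry.value match {
    case c: AutoCloseable =>
      try c.close()
      catch { case NonFatal(_) => () } // a real implementation would log this
    case _ => ()
  }

  // Remove the entry from the map, then close it outside the lock.
  def remove(id: String): Boolean = {
    val removed = entries.synchronized { entries.remove(id) }
    if (removed != null) {
      closeIfNeeded(removed)
      true
    } else {
      false
    }
  }

  // Drain the map under the lock, then close every drained entry.
  def clear(): Unit = {
    val drained = entries.synchronized {
      val snapshot = new ArrayList[Entry](entries.values())
      entries.clear()
      snapshot
    }
    val it = drained.iterator()
    while (it.hasNext) closeIfNeeded(it.next())
  }
}
```

Values that do not implement AutoCloseable are left to the GC as before, so 
in this sketch the behavior for existing entries is unchanged.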

Verification:
In our own use case, we implemented a user-defined off-heap-hashRelation for 
BHJ (broadcast hash join), and we verified that with this manual close, our 
off-heap-hashRelation is released when the entry is evicted.
We also implemented a user-defined cachedBatch, which with this PR is released 
when InMemoryRelation.clearCache() is called.
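
For context, a user-defined object of the kind described above might look 
like the following sketch. OffHeapBatch and FakeOffHeap are hypothetical 
names: the allocator only tracks byte counts so the example can run without 
native memory, and the idempotent close() is an assumption made in case 
close() is invoked more than once.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicLong}

// Hypothetical allocator that only counts bytes; a real one would manage
// native memory.
object FakeOffHeap {
  val allocated = new AtomicLong(0L)
  def allocate(bytes: Long): Unit = { allocated.addAndGet(bytes); () }
  def free(bytes: Long): Unit = { allocated.addAndGet(-bytes); () }
}

// Hypothetical user-defined cached batch that owns off-heap memory.
final class OffHeapBatch(val sizeInBytes: Long) extends AutoCloseable {
  private val closed = new AtomicBoolean(false)
  FakeOffHeap.allocate(sizeInBytes)

  def isClosed: Boolean = closed.get()

  // Idempotent: only the first call releases the memory.
  override def close(): Unit =
    if (closed.compareAndSet(false, true)) FakeOffHeap.free(sizeInBytes)
}
```

Without an explicit close() call, the counted bytes in this sketch would 
never be returned, since the GC only reclaims the small on-heap wrapper.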
h3. Why are the changes needed?

These changes help clean up user-defined off-heap objects that may be cached 
in InMemoryRelation or MemoryStore.
h3. Does this PR introduce _any_ user-facing change?

NO
h3. How was this patch tested?

WIP

Signed-off-by: Chendi Xue [[email protected]|mailto:[email protected]]

  was:
Current MemoryStore uses a LinkedHashMap[BlockId, MemoryEntry[_]] to store all 
OnHeap or OffHeap entries.

When memoryStore.remove(blockId) is called, the code simply removes the entry 
from the LinkedHashMap and relies on Java GC to release it.

We propose an add-on: if the stored object implements AutoCloseable, we call 
its close() directly in the MemoryStore.clear() and MemoryStore.remove() 
functions.

In our case, we are implementing a user-defined off-heap-hashRelation for BHJ, 
and we verified that with this manual close, our off-heap-hashRelation is 
released when the entry is evicted.

At the same time, we also want to add this logic to InMemoryRelation to 
manually close a user-defined CachedBatch.


> Support to manual close entries in MemoryStore and InMemoryRelation
> -------------------------------------------------------------------
>
>                 Key: SPARK-35396
>                 URL: https://issues.apache.org/jira/browse/SPARK-35396
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core, SQL
>    Affects Versions: 3.1.1
>            Reporter: Chendi.Xue
>            Priority: Major


