[ 
https://issues.apache.org/jira/browse/HIVE-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Wohlstadter updated HIVE-20203:
------------------------------------
    Description: 
ArrowColumnarBatchSerDe allocates an Arrow NullableMapVector for each task that 
uses the SerDe.

The vector is backed by a DirectByteBuffer allocated from Arrow's off-heap buffer pool.

This buffer is never closed, so each task leaks about 1 KB of physical memory.

This patch does three things:
 # Ensures the buffer is closed when the RecordWriter for the task is closed. 
 # Adds per-task memory accounting by assigning a ChildAllocator to each task 
from the RootAllocator.
 # Enforces that the ChildAllocator for a task has released all memory assigned 
to it when the task is completed. 

The patch assumes that close() is always called on the RecordWriter when a task 
is finished (even if there is a failure during task execution). 
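
The three steps above can be sketched in plain Java. This is only a minimal model of the pattern, not the patch itself: TrackingAllocator, TrackedBuffer and TaskRecordWriter are hypothetical stand-ins, not Arrow's or Hive's actual classes, and the 1024-byte allocation simply mirrors the "about 1K" figure. In Arrow the corresponding roles are played by RootAllocator and the child allocators created from it.

```java
/** Hypothetical stand-in for an off-heap allocator (NOT Arrow's API). */
final class TrackingAllocator implements AutoCloseable {
    private final String name;
    private final TrackingAllocator parent; // null for the root
    private long allocatedBytes;
    private boolean closed;

    TrackingAllocator(String name) { this(name, null); }

    private TrackingAllocator(String name, TrackingAllocator parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Step 2: each task gets its own child, so usage is accounted per task. */
    TrackingAllocator newChild(String childName) {
        return new TrackingAllocator(childName, this);
    }

    TrackedBuffer allocate(long size) {
        adjust(size);
        return new TrackedBuffer(this, size);
    }

    /** Records a change in usage here and in every ancestor. */
    void adjust(long delta) {
        allocatedBytes += delta;
        if (parent != null) {
            parent.adjust(delta);
        }
    }

    long getAllocatedBytes() { return allocatedBytes; }

    /** Step 3: closing with outstanding memory is reported as a leak. */
    @Override
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        if (allocatedBytes != 0) {
            throw new IllegalStateException(
                "Allocator " + name + " leaked " + allocatedBytes + " bytes");
        }
    }
}

/** Hypothetical buffer handle; releasing it returns its bytes to the owner. */
final class TrackedBuffer implements AutoCloseable {
    private final TrackingAllocator owner;
    private final long size;
    private boolean released;

    TrackedBuffer(TrackingAllocator owner, long size) {
        this.owner = owner;
        this.size = size;
    }

    @Override
    public void close() {
        if (!released) {
            released = true;
            owner.adjust(-size);
        }
    }
}

/** Hypothetical per-task writer that owns the task's allocator and buffer. */
final class TaskRecordWriter implements AutoCloseable {
    private final TrackingAllocator taskAllocator;
    private final TrackedBuffer vectorBuffer;

    TaskRecordWriter(TrackingAllocator root, String taskId) {
        this.taskAllocator = root.newChild(taskId);
        this.vectorBuffer = taskAllocator.allocate(1024);
    }

    /** Step 1: release the buffer first, then verify nothing is left. */
    @Override
    public void close() {
        vectorBuffer.close();
        taskAllocator.close();
    }
}

final class LeakCheckDemo {
    public static void main(String[] args) {
        TrackingAllocator root = new TrackingAllocator("root");
        // try-with-resources calls close() even if the task body throws,
        // which is the assumption the patch relies on.
        try (TaskRecordWriter writer = new TaskRecordWriter(root, "task-0")) {
            // ... write records for the task ...
        }
        System.out.println("bytes still allocated: " + root.getAllocatedBytes());
    }
}
```

Closing the buffer before the allocator matters in this model: reversing the order would make the leak check fire on every task, including well-behaved ones.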

  was:
ArrowColumnarBatchSerDe allocates an arrow NullableMapVector for each task that 
uses the serde.

The vector is a DirectByteBuffer allocated from Arrow's off-heap buffer pool.

This buffer is never closed and leaks about 1K of physical memory for each task.

This patch does three things:
 # Ensure the buffer is closed when the RecordWriter for the task is closed. 
 # Adds per-task memory accounting by assigning a ChildAllocator to each task 
from the RootAllocator.
 # Enforces that the ChildAllocator for a task has released all memory assigned 
to it, when the task is completed. 

The patch assumes that close() is always called on the RecordWriter when a task 
is finished (even if their is a failure during task execution). 


> Arrow SerDe leaks a DirectByteBuffer
> ------------------------------------
>
>                 Key: HIVE-20203
>                 URL: https://issues.apache.org/jira/browse/HIVE-20203
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Eric Wohlstadter
>            Assignee: Eric Wohlstadter
>            Priority: Blocker
>         Attachments: HIVE-20203.1.patch
>
>
> ArrowColumnarBatchSerDe allocates an Arrow NullableMapVector for each task 
> that uses the SerDe.
> The vector is backed by a DirectByteBuffer allocated from Arrow's off-heap buffer pool.
> This buffer is never closed, so each task leaks about 1 KB of physical 
> memory.
> This patch does three things:
>  # Ensures the buffer is closed when the RecordWriter for the task is closed. 
>  # Adds per-task memory accounting by assigning a ChildAllocator to each task 
> from the RootAllocator.
>  # Enforces that the ChildAllocator for a task has released all memory 
> assigned to it when the task is completed. 
> The patch assumes that close() is always called on the RecordWriter when a 
> task is finished (even if there is a failure during task execution). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
