[
https://issues.apache.org/jira/browse/GOBBLIN-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ratandeep Ratti updated GOBBLIN-1715:
-------------------------------------
Description:
The pre-allocation method allocates vastly more memory for ORC ColumnVectors of
arrays and maps than needed, and the amount is unpredictable because it depends
on the current column vector's length, which itself grows as we allocate more
memory. In a heap dump taken while ingesting a Kafka topic, we saw that on the
second resize call for an array ColumnVector, where the requested size was ~1k
elements, it attempted to allocate around 444M elements. This over-allocation
far exceeded the heap size and was the primary reason we saw OOM failures when
ingesting deeply nested records.
Update: Below is an example of how the "smart" resizing procedure can allocate
a very large amount of memory. The formula for the allocation is
{noformat}
child_vector resize =
child_vector_request_size +
(child_vector_request_size / rowsAdded + 1) * current_vector_size
{noformat}
If a row contains deeply nested arrays of arrays, each with 525 elements, the
memory is allocated as follows.
{noformat}
1st resize = 525 + (525/1 + 1) * 256    = 135181   ; current vector size defaults to the batch size = 256
2nd resize = 525 + (525/1 + 1) * 135181 = 71105731 ; current vector size = 135181
{noformat}
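The blow-up above can be reproduced numerically. Below is a minimal sketch of only the quoted formula, not the actual Gobblin/ORC resize code; the class and method names are illustrative:

```java
// Sketch of the child-vector pre-allocation formula quoted above (assumption:
// this models only the formula, not the real ORC ListColumnVector logic).
public class ResizeBlowup {
    // child_vector_request_size + (child_vector_request_size / rowsAdded + 1) * current_vector_size
    static long resize(long requestSize, long rowsAdded, long currentSize) {
        return requestSize + (requestSize / rowsAdded + 1) * currentSize;
    }

    public static void main(String[] args) {
        long currentSize = 256; // default batch size
        long requestSize = 525; // elements per nested array, one row added at a time
        for (int depth = 1; depth <= 3; depth++) {
            currentSize = resize(requestSize, 1, currentSize);
            // prints 1: 135181, then 2: 71105731, then 3: 37401615031
            System.out.println(depth + ": " + currentSize);
        }
    }
}
```

Because the previous result feeds back in as `current_vector_size`, each nesting level multiplies the allocation by roughly `request_size / rowsAdded + 1` (here 526x), so three levels of nesting already demand tens of billions of elements.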
was:
The pre-allocation method allocates vastly more memory for ORC ColumnVectors of
arrays and maps than needed, and the amount is unpredictable because it depends
on the current column vector's length, which itself grows as we allocate more
memory. In a heap dump taken while ingesting a Kafka topic, we saw that on the
second resize call for an array ColumnVector, where the requested size was ~1k
elements, it attempted to allocate around 444M elements. This over-allocation
far exceeded the heap size and was the primary reason we saw OOM failures when
ingesting deeply nested records.
Update: Below is an example of how the "smart" resizing procedure can allocate
a very large amount of memory. The formula for the allocation is
child_vector resize =
child_vector_request_size +
(child_vector_request_size / rowsAdded + 1) * current_vector_size
If a row contains deeply nested arrays of arrays, each with 525 elements, the
memory is allocated as follows.
1st resize = 525 + (525/1 + 1) * 256    = 135181   ; current vector size defaults to the batch size = 256
2nd resize = 525 + (525/1 + 1) * 135181 = 71105731 ; current vector size = 135181
> Support vectorized row batch pooling
> ------------------------------------
>
> Key: GOBBLIN-1715
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1715
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-core
> Reporter: Ratandeep Ratti
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The pre-allocation method allocates vastly more memory for ORC ColumnVectors
> of arrays and maps than needed, and the amount is unpredictable because it
> depends on the current column vector's length, which itself grows as we
> allocate more memory. In a heap dump taken while ingesting a Kafka topic, we
> saw that on the second resize call for an array ColumnVector, where the
> requested size was ~1k elements, it attempted to allocate around 444M
> elements. This over-allocation far exceeded the heap size and was the primary
> reason we saw OOM failures when ingesting deeply nested records.
> Update: Below is an example of how the "smart" resizing procedure can
> allocate a very large amount of memory. The formula for the allocation is
> {noformat}
> child_vector resize =
> child_vector_request_size +
> (child_vector_request_size / rowsAdded + 1) * current_vector_size
> {noformat}
> If a row contains deeply nested arrays of arrays, each with 525 elements,
> the memory is allocated as follows.
> {noformat}
> 1st resize = 525 + (525/1 + 1) * 256    = 135181   ; current vector size defaults to the batch size = 256
> 2nd resize = 525 + (525/1 + 1) * 135181 = 71105731 ; current vector size = 135181
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)