[
https://issues.apache.org/jira/browse/HDDS-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886777#comment-16886777
]
Xudong Cao edited comment on HDDS-1785 at 7/17/19 10:49 AM:
------------------------------------------------------------
1. Deep size of the 3 objects below (deep size is the combined size of the
specified object and all objects on its reference tree), measured with the API
RamUsageEstimator.sizeOf(Object obj):
# VolumeProcessor: 2 621 008
# BucketProcessor: 3 071 776
# KeyProcessor: 3 083 024
2. Shallow size of the same 3 objects (shallow size is the size of the object
itself in heap space, without following references), measured with the API
RamUsageEstimator.shallowSizeOf(Object obj):
# VolumeProcessor: 24
# BucketProcessor: 24
# KeyProcessor: 24
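As a rough illustration of what shallowSizeOf reports, here is a simplified sketch of the shallow-size rule on a 64-bit JVM with compressed oops (an assumption; the real numbers come from RamUsageEstimator, which inspects JVM internals): object header plus declared fields, rounded up to 8-byte alignment, with referenced objects not followed.

```java
// Simplified model of shallow object size; the constants are assumptions
// for a 64-bit HotSpot JVM with compressed oops, not authoritative values.
public class ShallowSizeSketch {
    static final int HEADER = 12; // mark word + compressed class pointer (assumed)
    static final int REF = 4;     // compressed object reference (assumed)
    static final int ALIGN = 8;   // HotSpot aligns objects to 8 bytes

    static long shallowSize(int refFields, int longFields) {
        long raw = HEADER + (long) refFields * REF + (long) longFields * 8L;
        return (raw + ALIGN - 1) / ALIGN * ALIGN; // round up to alignment
    }

    public static void main(String[] args) {
        // An object holding only one reference field (e.g. the implicit
        // outer 'this' of an inner class) fits in 16 bytes under this model:
        System.out.println(shallowSize(1, 0)); // 16
        // One reference field plus one long field already reaches 24 bytes:
        System.out.println(shallowSize(1, 1)); // 24
    }
}
```

The exact 24-byte figure depends on the processors' field layout; the point is that shallow size stays constant and tiny no matter how large the referenced RandomKeyGenerator is.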
3. Their deep size is so large mainly because these three classes are all
non-static inner classes of RandomKeyGenerator: each instance holds an implicit
'this' reference to the enclosing RandomKeyGenerator, which accounts for most
of the deep size. So in an actual test we mainly consider their shallow size,
especially the shallow size of KeyProcessor (because it is by far the most
numerous object).
# In the case of 1 million keys, the KeyProcessors will occupy 24 * 1 million =
about 23 MB of memory.
# In the case of 100 million keys, the KeyProcessors will occupy 24 * 100
million = about 2.2 GB of memory.
# The above is the worst case, in which no KeyProcessor is consumed until all
of them have been submitted. In practice the KeyProcessors are generally
consumed while they are being submitted, so the total memory usage should be
smaller.
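The deep-size observation above can be demonstrated with plain reflection: a non-static inner class gets a compiler-generated synthetic field (typically named this$0) pointing at the enclosing instance, so any deep-size walk from a processor also reaches everything the generator holds. The classes below are hypothetical stand-ins, not the real Freon types.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

public class InnerClassDemo {
    // Hypothetical stand-in for RandomKeyGenerator with its inner processor.
    static class Outer {
        byte[] big = new byte[1_000_000]; // large state, like the generator's

        class KeyProcessor implements Runnable { // non-static: captures Outer.this
            public void run() { }
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer();
        Outer.KeyProcessor p = outer.new KeyProcessor();
        // javac adds a synthetic field holding the outer instance, so a
        // deep-size walk from p also counts outer.big.
        boolean hasOuterRef = Arrays.stream(p.getClass().getDeclaredFields())
                .anyMatch(Field::isSynthetic);
        System.out.println(hasOuterRef); // true
    }
}
```

Declaring such task classes static (and passing in only what they need) would shrink their deep size to roughly their shallow size.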
4. This memory usage is actually not a lot, so in general I think there will be
no OOM; but if the number of keys is very large and the memory of the test
machine is relatively small, an OOM error can still occur.
5. Finally, I think this patch is very good: it no longer waits for all the
tasks to be generated before processing them, so tasks do not pile up, which
fundamentally solves the problem of potentially excessive memory usage.
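One common way to get this "consume while submitting" behavior (a minimal sketch of the general technique, not the actual HDDS-1785 patch) is to give the executor a bounded queue and make producers run rejected tasks themselves, so submission is throttled to roughly the consumption rate:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of bounded task submission; names and sizes are illustrative.
public class BoundedFreonSketch {
    static int run(int totalKeys) throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                10, 10, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),              // bounded backlog
                new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure

        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < totalKeys; i++) {
            // Never more than ~100 tasks wait in memory; when the queue is
            // full, the submitting thread executes the task itself.
            executor.execute(done::incrementAndGet);
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10_000)); // 10000
    }
}
```

With this shape, memory for pending tasks is bounded by the queue capacity rather than by the total key count.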
> OOM error in Freon due to the concurrency handling
> --------------------------------------------------
>
> Key: HDDS-1785
> URL: https://issues.apache.org/jira/browse/HDDS-1785
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Elek, Marton
> Assignee: Doroszlai, Attila
> Priority: Blocker
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HDDS-1532 modified the concurrent framework usage of Freon
> (RandomKeyGenerator).
> The new approach uses separated tasks (Runnable) to create the
> volumes/buckets/keys.
> Unfortunately it doesn't work very well in some cases.
> # When Freon starts, it creates an executor with a fixed number of threads (10)
> # The first loop submits numOfVolumes (10) VolumeProcessor tasks to the
> executor
> # The 10 threads start to execute the 10 VolumeProcessor tasks
> # Each VolumeProcessor tasks creates numOfBuckets (1000) BucketProcessor
> tasks. All together 10000 tasks are submitted to the executor.
> # The 10 threads start to execute the first 10 BucketProcessor tasks, which
> start to create the KeyProcessor tasks: 500 000 * 10 tasks are submitted.
> # At this point no keys have been generated yet, but the next 10
> BucketProcessor tasks start to execute..
> # Before the first key creation can run, all the BucketProcessor tasks must
> be processed, which means that all the key creation tasks (10 * 1000
> * 500 000) are created and added to the executor
> # This requires a huge amount of time and memory
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]