[ 
https://issues.apache.org/jira/browse/HDDS-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenXi updated HDDS-11939:
--------------------------
    Description: 
h2. Phenomena

We have found a memory leak in DataStreamMapImpl when writing with the Stream pipeline in our production environment. The problem can be reproduced on the current master branch.

Even long after the write requests have stopped, DataStreamMapImpl still holds a very large number of DataStream objects. These DataStream objects keep accumulating and are never released unless the process is restarted.

!image-2024-12-15-22-21-26-187.png|width=993,height=758!

!image-2024-12-15-21-14-27-612.png|width=1209,height=474!
h2. Reproduction method
h3. Start the cluster in Streaming mode

Refer to: [https://ozone.apache.org/docs/edge/feature/streaming-write-pipeline.html]
h3. Execute the reproduction commands

 
{code:bash}
for i in `seq 1 10`; do ozone freon ommg --operation CREATE_STREAM_FILE -n 100 -t 100 --size=1M --volume s3v --bucket bucket1 --duration 5; done
{code}
Another way to reproduce is to use timeout to forcibly terminate the writes; this leaves even more leaked entries in DataStreamMapImpl#map:
{code:bash}
for i in `seq 1 10`; do timeout 10 ozone freon ommg --operation CREATE_STREAM_FILE -n 100 -t 100 --size=1M --volume s3v --bucket bucket1 --duration 100; done
{code}
 

 

Note: -t 100 sets the number of client threads to 100. This is the key to reproduction: the leak only appears with a multi-threaded client.
h3. DataStreamMapImpl

At this point some DataStream objects have been left behind in DataStreamMapImpl, but since there is no logging there, this cannot be observed directly.

To make it observable, I added a log statement in DataStreamMapImpl#remove (a sketch of the change follows the screenshots below). Here are screenshots of my logs:

!image-2024-12-15-22-35-16-703.png|width=851,height=382!

!image-2024-12-15-22-18-24-185.png|width=1816,height=920!
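
For reference, the logging change was roughly of the following form. This is only an illustrative sketch: the real DataStreamMapImpl lives in Apache Ratis, and the field and method signatures below are simplified stand-ins rather than copies of the actual class.
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Illustrative sketch only -- NOT the real Ratis DataStreamMapImpl.
 * It shows the idea of the added logging: log each removal together with
 * the number of entries still held, so leaked streams become visible.
 */
public final class LoggingStreamMapSketch<K, S> {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingStreamMapSketch.class);

  private final String name;
  private final ConcurrentMap<K, CompletableFuture<S>> map = new ConcurrentHashMap<>();

  public LoggingStreamMapSketch(String name) {
    this.name = name;
  }

  public CompletableFuture<S> computeIfAbsent(K key) {
    return map.computeIfAbsent(key, k -> new CompletableFuture<>());
  }

  public CompletableFuture<S> remove(K key) {
    final CompletableFuture<S> removed = map.remove(key);
    // The added log line: whether the entry existed and how many remain.
    LOG.info("{}: remove {} -> {}, remaining={}",
        name, key, removed != null ? "removed" : "absent", map.size());
    return removed;
  }
}
{code}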
h3. Netty direct memory

Netty's direct memory also keeps growing and never drops.

Before
!image-2024-12-15-22-12-41-704.png|width=675,height=120!

After
!image-2024-12-15-22-14-22-904.png|width=735,height=108!
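
The numbers above come from our own monitoring. A quick way to sample the same figure in code is Netty's PlatformDependent counters; this is a minimal sketch, assuming Netty 4.1 is on the classpath and its internal direct-memory accounting is enabled:
{code:java}
import io.netty.util.internal.PlatformDependent;

/**
 * Minimal sketch: periodically print Netty's view of reserved direct memory.
 * usedDirectMemory() returns -1 when Netty is not tracking direct memory.
 */
public final class DirectMemoryProbe {
  public static void main(String[] args) throws InterruptedException {
    while (true) {
      final long used = PlatformDependent.usedDirectMemory();
      final long max = PlatformDependent.maxDirectMemory();
      System.out.printf("netty direct memory: used=%d bytes, max=%d bytes%n", used, max);
      Thread.sleep(10_000L); // sample every 10 seconds
    }
  }
}
{code}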

 

 
h3. Other

The {{cleanUpOnChannelInactive}} method does not clean up the leaked {{DataStream}} entries in {{DataStreamMapImpl}}.

With a single-threaded client, the leak does not occur:

 
{code:bash}
for i in `seq 1 10`; do ozone freon ommg --operation CREATE_STREAM_FILE -n 100 -t 1 --size=1M --volume s3v --bucket bucket1 --duration 5; done
{code}
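
As a purely hypothetical illustration of why forcibly killed, multi-threaded clients leave entries behind (this is not the Ratis code; all names below are made up): if map entries are only removed when a stream is closed cleanly, every stream abandoned mid-write keeps its entry, and therefore its buffers, reachable forever.
{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical illustration (not Ratis code): entries are only removed on a
 * clean close, so streams abandoned mid-write (e.g. the client was killed by
 * `timeout`) are never removed -- the same symptom as DataStreamMapImpl above.
 */
public final class AbandonedStreamSketch {
  private static final ConcurrentMap<Long, CompletableFuture<String>> STREAMS =
      new ConcurrentHashMap<>();

  static void open(long id) {
    STREAMS.computeIfAbsent(id, k -> new CompletableFuture<>());
  }

  static void closeCleanly(long id) {
    final CompletableFuture<String> f = STREAMS.remove(id);
    if (f != null) {
      f.complete("closed");
    }
  }

  public static void main(String[] args) {
    // 100 "client threads" each open a stream...
    for (long id = 0; id < 100; id++) {
      open(id);
    }
    // ...but only half of them finish cleanly; the rest are "killed".
    for (long id = 0; id < 50; id++) {
      closeCleanly(id);
    }
    // The abandoned half is still referenced by the map: a leak.
    System.out.println("entries still held: " + STREAMS.size()); // prints 50
  }
}
{code}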
 


> Ratis memory leak in DataStreamMapImpl during Stream write
> -----------------------------------------------------------
>
>                 Key: HDDS-11939
>                 URL: https://issues.apache.org/jira/browse/HDDS-11939
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: ChenXi
>            Priority: Critical
>         Attachments: image-2024-12-15-21-14-27-612.png, 
> image-2024-12-15-22-12-41-704.png, image-2024-12-15-22-14-22-904.png, 
> image-2024-12-15-22-18-24-185.png, image-2024-12-15-22-21-26-187.png, 
> image-2024-12-15-22-35-16-703.png
>
>


